INEQUALITIES FOR THE ZEROS OF LEGENDRE
POLYNOMIALS AND RELATED FUNCTIONS* 


BY 
GABRIEL SZEGÖ


INTRODUCTION 


In what follows we deal mainly with some inequalities for the zeros of
Legendre polynomials $P_n(\cos\theta)$, given by Bruns [1],† and later independently
by A. Markoff [6] and by Stieltjes [8], improving the results of Bruns.

Let $\theta_1, \theta_2, \dots, \theta_n$ denote the zeros of $P_n(\cos\theta)$ in the interval $(0, \pi)$ in
increasing order, so that

(1)  $0 < \theta_1 < \theta_2 < \dots < \theta_n < \pi.$

Then the inequalities of Bruns can be formulated as follows:

(2)  $\dfrac{(\nu - \frac12)\pi}{n + \frac12} < \theta_\nu < \dfrac{\nu\pi}{n + \frac12} \qquad (\nu = 1, 2, \dots, n).$

The improved inequalities due both to A. Markoff and Stieltjes are

(3)  $\dfrac{(\nu - \frac12)\pi}{n} < \theta_\nu < \dfrac{\nu\pi}{n + 1} \qquad (\nu = 1, 2, \dots, [n/2]).$

This concerns only the group of zeros lying in the interval $0 < \theta < \pi/2$. The
symmetric property

(4)  $\theta_\nu + \theta_{n+1-\nu} = \pi$

yields however a similar estimate for the second group of zeros in the interval
$\pi/2 < \theta < \pi$.

These inequalities indicate in particular the “regular distribution” of the
systems of zeros in the interval $(0, \pi)$ if $n \to \infty$.
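As a numerical illustration of (2), (3) and (4), the following minimal sketch computes the zeros $\theta_\nu$ and checks the three statements for small $n$. The zero finder (Newton's method on the three-term recurrence) is a standard technique that is not taken from this paper, and the helper names `legendre_thetas` and `check_bruns_markoff_stieltjes` are ours.

```python
import math

def legendre_thetas(n):
    """Zeros theta_1 < ... < theta_n of P_n(cos theta) in (0, pi).

    Illustrative helper (not from the paper): Newton's method on P_n(x),
    started from the classical guess x = cos(pi*(nu - 1/4)/(n + 1/2)).
    """
    thetas = []
    for nu in range(1, n + 1):
        x = math.cos(math.pi * (nu - 0.25) / (n + 0.5))
        for _ in range(100):
            p_prev, p = 1.0, x                         # P_0(x), P_1(x)
            for k in range(2, n + 1):                  # recurrence up to P_n(x)
                p_prev, p = p, ((2 * k - 1) * x * p - (k - 1) * p_prev) / k
            dp = n * (x * p - p_prev) / (x * x - 1.0)  # P_n'(x)
            dx = p / dp
            x -= dx
            if abs(dx) < 1e-15:
                break
        thetas.append(math.acos(x))
    return thetas

def check_bruns_markoff_stieltjes(n):
    """Assert (2), (3) and (4) of the Introduction for a given n."""
    th = legendre_thetas(n)
    for nu in range(1, n + 1):
        t = th[nu - 1]
        # (2): Bruns
        assert (nu - 0.5) * math.pi / (n + 0.5) < t < nu * math.pi / (n + 0.5)
        # (4): symmetry theta_nu + theta_{n+1-nu} = pi (up to rounding)
        assert abs(t + th[n - nu] - math.pi) < 1e-12
    for nu in range(1, n // 2 + 1):
        t = th[nu - 1]
        # (3): Markoff-Stieltjes, first half of the zeros only
        assert (nu - 0.5) * math.pi / n < t < nu * math.pi / (n + 1)
    return True

if __name__ == "__main__":
    for n in range(1, 41):
        check_bruns_markoff_stieltjes(n)
    print("(2), (3), (4) hold numerically for n = 1, ..., 40")
```

The check is only a floating-point sanity test for finitely many $n$; it does not replace the proofs given in §II.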

We show in the first part of this paper how the inequalities (2), (3) can
be derived in a very simple way using the classical ideas of Sturm and the
well known differential equation satisfied by the function $P_n(\cos\theta)$. The
second part contains some elementary facts about the zeros of a class of
trigonometric polynomials and gives also some inequalities for the zeros of
$P_n(\cos\theta)$ derived on the basis of these facts.


* Presented to the Society, September 13, 1935; received by the editors February 2, 1935. 
† Numbers in bold face type refer to the Bibliography at the end of this paper.

We start in §I with a formulation of a theorem of Sturm adapted to our
later needs. In §II we prove (2) and (3) by means of Sturm’s method, obtaining
the lower estimate of (3) in even a sharper form [$(\nu - \frac14)\pi/(n + \frac12)$
instead of $(\nu - \frac12)\pi/n$]. The proof of the upper bound is based on the following
remarkable property of the zeros in question, which is a simple consequence
of Sturm’s theorem. The sequence

(5)  $0 = \theta_0, \theta_1, \theta_2, \dots, \theta_{p+1}, \qquad p = [n/2],$

is convex, that is to say, the differences $\theta_\nu - \theta_{\nu-1}$ are increasing if $\nu$ runs from 1
to $p + 1$. [Cf. the hint in Hille 5, p. 162.]
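The convexity of the sequence (5) and the sharper lower estimate $(\nu - \frac14)\pi/(n + \frac12) < \theta_\nu$ can be observed numerically. The sketch below is a floating-point illustration under our own assumptions: the Newton-based zero finder and the names `first_half_differences`, `is_convex_first_half` and `sharper_lower_bound_holds` are not from the paper.

```python
import math

def legendre_thetas(n):
    """Zeros theta_1 < ... < theta_n of P_n(cos theta) (illustrative Newton solver)."""
    thetas = []
    for nu in range(1, n + 1):
        x = math.cos(math.pi * (nu - 0.25) / (n + 0.5))
        for _ in range(100):
            p_prev, p = 1.0, x
            for k in range(2, n + 1):
                p_prev, p = p, ((2 * k - 1) * x * p - (k - 1) * p_prev) / k
            dx = p * (x * x - 1.0) / (n * (x * p - p_prev))  # p / P_n'(x)
            x -= dx
            if abs(dx) < 1e-15:
                break
        thetas.append(math.acos(x))
    return thetas

def first_half_differences(n):
    """theta_nu - theta_{nu-1} for nu = 1, ..., [n/2] + 1, with theta_0 = 0."""
    th = [0.0] + legendre_thetas(n)
    p = n // 2
    return [th[nu] - th[nu - 1] for nu in range(1, p + 2)]

def is_convex_first_half(n):
    """True if the differences in (5) are strictly increasing."""
    d = first_half_differences(n)
    return all(d[i] < d[i + 1] for i in range(len(d) - 1))

def sharper_lower_bound_holds(n):
    """True if (nu - 1/4)*pi/(n + 1/2) < theta_nu for nu = 1, ..., [n/2]."""
    th = legendre_thetas(n)
    return all((nu - 0.25) * math.pi / (n + 0.5) < th[nu - 1]
               for nu in range(1, n // 2 + 1))

if __name__ == "__main__":
    for n in (2, 3, 10, 25):
        print(n, is_convex_first_half(n), sharper_lower_bound_holds(n))
```

For moderate $n$ the gaps being compared are many orders of magnitude larger than the rounding error, which is why plain double precision suffices here.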

§III treats of some analogous properties of the Bessel function $J_0(\theta)$ by
Sturm’s method. We obtain some new inequalities for the zeros $\theta_\nu$ in terms of
the zeros of $J_0(\theta)$. There is no difficulty in extending these results to some
ultraspherical polynomials and to general Bessel functions (§IV).

In the second part we first consider trigonometric cosine polynomials

(6)  $\lambda_0 \cos mt + \lambda_1 \cos (m - 1)t + \dots + \lambda_{m-1} \cos t + \lambda_m$

with positive and monotonically increasing coefficients (the coefficient of
$\cos kt$ increases with $k$):

(7)  $\lambda_0 > \lambda_1 > \lambda_2 > \dots > \lambda_m > 0.$

Pólya has shown [7, p. 359], by a simple application of the principle of the
argument, that the zeros of such a polynomial are all real and simple. We
prove that, under the condition mentioned (and even if equality holds in
place of all inequalities (7) except the first), every interval

(8)  $\dfrac{(\mu - \frac12)\pi}{m + \frac12} < t < \dfrac{(\mu + \frac12)\pi}{m + \frac12} \qquad (\mu = 1, 2, \dots, m)$

contains exactly one of these zeros (§V). These inequalities yield, under a rather
general condition, the “regular distribution” of the zeros for large values of $m$.
The extremely simple proof is based on the classical fact that the sums

(9)  $\sin \tfrac12 t + \sin \tfrac32 t + \dots + \sin (m + \tfrac12)t \qquad (m = 0, 1, 2, 3, \dots)$

are positive in the interval $0 < t < 2\pi$. As an application of our result we derive
very simply the main theorem of Pólya’s paper quoted above (§VI).

Legendre polynomials $P_n(\cos\theta)$ are not exactly of the form (6), (7). Our
method gives however also in this case some inequalities for the zeros which
are only a little less precise than those of Bruns. The same method can be
applied to some generalizations of Legendre polynomials due to Fejér [3].
In special cases an improvement of these results can be easily obtained. We
prove for instance the lower estimate in (3) by this elementary method.


Part 1. APPLICATIONS OF STURM’S METHOD 


I. PRELIMINARIES 
1. With regard to the later applications it is advantageous to formulate
Sturm’s theorem in the following form:

Let $f(x)$ and $F(x)$ be continuous functions in $a < x \le b$ and let there $f(x) \le F(x)$
but $f(x) \not\equiv F(x)$. Let the functions $y(x)$ and $Y(x)$ satisfy in $a < x \le b$* the
differential equations

(1)  $y'' + f(x)y = 0, \qquad Y'' + F(x)Y = 0,$

and further the following conditions:

(2)  $y(x) > 0$ in $a < x < b$, $\quad y(b) = 0$;

(3)  $\lim_{x \to a+0} \{ y'(x)Y(x) - y(x)Y'(x) \}$ exists and $\ge 0$.

Then either the function $Y(x)$ is identically zero or it assumes negative values
in some subintervals of $(a, b)$.

It may be observed that our equations are not necessarily satisfied for
$x = a$.

The essential idea of the proof is well known. Namely as a consequence
of the assumption $Y(x) \ge 0$, $Y(x) \not\equiv 0$ in $a < x < b$, we have

$\big[ y'(x)Y(x) - y(x)Y'(x) \big]_{x_1}^{b} = \int_{x_1}^{b} \{F(x) - f(x)\}\, y(x)Y(x)\,dx > K > 0,$

provided that $a < x_1 < b$ and $x_1 - a$ is sufficiently small. Here the positive
number $K$ is independent of $x_1$. Consequently we have [cf. (3)]

$y'(b)Y(b) - y(b)Y'(b) = y'(b)Y(b) > 0.$

Now, by (2), $y'(b) < 0$, whence $Y(b) < 0$, which is a contradiction.

If the limit in (3) is $\le 0$, the statement is changed in an obvious way. If
in addition to (2) the condition

(3’)  $\lim_{x \to a+0} \{ y'(x)Y(x) - y(x)Y'(x) \} = 0$

is satisfied, $Y(x)$ is either identically zero or it has at least one variation of
sign in $a < x < b$.

The same statement holds in the well known classical case in which
conditions (2) and (3’) are replaced by

(4)  $y(x) > 0$ in $a < x < b$, $\quad y(a) = y(b) = 0,$

and both differential equations (1) are satisfied in the closed interval
$a \le x \le b$. In this case $Y(x) \ge 0$, $Y(x) \not\equiv 0$, $a \le x \le b$, implies

$y'(b)Y(b) - y'(a)Y(a) > 0.$

This is a contradiction because

$y'(a) > 0, \quad y'(b) < 0, \quad Y(a) \ge 0, \quad Y(b) \ge 0.$

* For $x = b$ this means that the left-hand derivatives of the first and second order exist at $x = b$
and satisfy differential equations (1). We write for brevity $y'(b - 0) = y'(b)$, $y''(b - 0) = y''(b)$ and so
on.

2. We mention in this connection the following important application: 

Let $\phi(x)$ be continuous and decreasing in $x_0 < x < X_0$, and let $y$ be a
solution of

(5)  $y'' + \phi(x)y = 0$

which is not identically zero. The sequence of zeros of $y$ is always convex, i.e.,
the sequence of differences of consecutive zeros is increasing.

This theorem also goes back to Sturm [9, p. 173; cf. also Hille 5]; it can
be deduced by means of the following simple argument. Let $p < q < r < s < \cdots$
be the zeros in question,* $q - p = h$. We apply Sturm’s theorem in the interval
$(q, r)$ to the equations

$y'' + \phi(x)y = 0, \qquad Y'' + \phi(x - h)Y = 0,$

the second having the solution $Y(x) = y(x - h)$; it is evident that $\phi(x) < \phi(x - h)$,
so that $r - q > h$, i.e., $r - q > q - p$.

Remarks. This proof, and consequently the last inequality, remain valid
under the following more general assumption:

(6)  $\phi(x) > \phi(q)$ for $x < q$ and $\phi(x) < \phi(q)$ for $x > q$.

Furthermore we can also have $p = x_0$, in the sense that $\lim_{x \to x_0 + 0} y(x) = 0$,
provided that condition (3’), §I, is fulfilled for $y(x)$ and $Y(x) = y(x - h)$ at $x = q$.
This means that

$\lim_{x \to q+0} \{ y'(x)y(x - h) - y(x)y'(x - h) \} = 0, \qquad h = q - x_0,$

or, since the first term tends to zero and $y(x)/(x - q)$ tends to a limit different
from zero,

(7)  $\lim_{x \to q+0} (x - q)\,y'(x - h) = \lim_{x \to x_0+0} (x - x_0)\,y'(x) = 0.$
3. In what follows we apply these theorems to the Legendre differential 
equation in the form 


* We suppose, of course, the existence of at least three zeros in the interval considered. 
(A)  $z'' + \{(n + \tfrac12)^2 + (2\sin\theta)^{-2}\}z = 0, \qquad z = (\sin\theta)^{1/2}P_n(\cos\theta),$

to the Bessel differential equation

(B)  $u'' + \{1 + (2\theta)^{-2}\}u = 0, \qquad u = \theta^{1/2}J_0(\theta),$

to the differential equation of the ultraspherical polynomials

(C)  $z'' + \{(n + \mu)^2 + \mu(1 - \mu)(\sin\theta)^{-2}\}z = 0, \qquad z = (\sin\theta)^{\mu}P_n^{(\mu)}(\cos\theta),$

and to the general Bessel differential equation

(D)  $u'' + \{1 + (\tfrac14 - \lambda^2)\theta^{-2}\}u = 0, \qquad u = \theta^{1/2}J_\lambda(\theta).$

The equations (A) and (C) are satisfied in the open interval $0, \pi$; (B)
and (D) are valid for $\theta > 0$.


II. LEGENDRE POLYNOMIALS 
1. We compare (A), $z = Y$, with the solution $y = \sin (n + \tfrac12)(x - a)$ of

(1)  $y'' + (n + \tfrac12)^2 y = 0.$

This gives at once the existence of at least one zero of $(\sin\theta)^{1/2}P_n(\cos\theta)$ in
every interval of length $\pi/(n + \tfrac12)$, especially in the intervals

$\dfrac{(\nu - 1)\pi}{n + \frac12}, \quad \dfrac{\nu\pi}{n + \frac12} \qquad (\nu = 1, 2, \dots, n)$

[in the first interval condition (3’), §I, is satisfied]. Consequently every
interval contains exactly one zero $\theta_\nu$, and

(2)  $\dfrac{(\nu - 1)\pi}{n + \frac12} < \theta_\nu < \dfrac{\nu\pi}{n + \frac12} \qquad (\nu = 1, 2, \dots, n).$

The lower estimate can be improved by means of the symmetric property

$\theta_\nu = \pi - \theta_{n+1-\nu} > \pi - \dfrac{(n + 1 - \nu)\pi}{n + \frac12} = \dfrac{(\nu - \frac12)\pi}{n + \frac12},$

so that we obtain the inequalities of Bruns.

2. We now prove first the upper estimate (3) of the Introduction. Since
$(2\sin\theta)^{-2}$ decreases in $0 < \theta \le \pi/2$, Sturm’s theorem (§I, 2) asserts the
convexity of the sequence of $\theta_\nu$ [(5) of the Introduction]. For even values of $n$
the first remark in §I, 2 [cf. (6)] must be used in the interval $\theta_{n/2} \le \theta \le \theta_{n/2+1}$.
Hence the convexity of the sequence

(3)  $\theta'_\nu = \theta_\nu - \dfrac{\nu\pi}{n + 1} \qquad (\nu = 0, 1, 2, \dots, [n/2] + 1)$

also follows. Now a convex sequence attains its maximum only at the end
points of any interval, that is to say, always for the first or the last value of
$\nu$. Since $\theta'_0 = 0$ and, for $n$ odd, $\theta'_{(n+1)/2} = 0$, this gives at once the upper bound
for $n$ odd. Let further $n$ be even; then $\theta'_{n/2} + \theta'_{n/2+1} = 0$, so that there are only
the possibilities

$\theta'_{n/2} < 0, \ \theta'_{n/2+1} > 0; \qquad \theta'_{n/2} > 0, \ \theta'_{n/2+1} < 0; \qquad \theta'_{n/2} = \theta'_{n/2+1} = 0.$

Each of the last two cases is impossible: $\theta'_0 = 0$, $\theta'_{n/2+1} < 0$ implies $\theta'_{n/2} < 0$ and
the third assumption would give, since $\theta'_0 = 0$, that $\theta'_\nu = 0$ for each $\nu$, i.e., the
identity of $P_n(\cos\theta)$ and $\sin (n + 1)\theta/\sin\theta$. But $\theta'_{n/2} < 0$ yields $\theta'_\nu < 0$, $\nu = 1,
2, \dots, n/2$, therefore we have again the upper bound obtained above.

We base the proof of the lower bound on the inequality

(4)  $\theta_\nu - \theta_{\nu-1} < \pi/(n + \tfrac12) \qquad (\nu = 1, 2, \dots, n; \ \theta_0 = 0)$

which is a consequence of the comparison of (A) with (1). We put now

$\theta''_\nu = \theta_\nu - \dfrac{(\nu - \frac14)\pi}{n + \frac12};$

according to (4) we have $\theta''_\nu - \theta''_{\nu-1} < 0$, so that $\theta''_\nu$ is decreasing. For $n$ odd,
$\theta''_{(n+1)/2} = 0$, consequently $\theta''_\nu > 0$, $\nu < (n + 1)/2$. For $n$ even it is sufficient to
prove $\theta''_{n/2} > 0$. This follows from (4), because

(5)  $\theta_{n/2+1} - \theta_{n/2} = \pi - \theta_{n/2} - \theta_{n/2} < \pi/(n + \tfrac12).$

Thus our theorem is completely proved.
III. BESSEL FUNCTION $J_0(\theta)$ AND LEGENDRE POLYNOMIALS AGAIN


1. We compare (B) first with $y'' + y = 0$. This gives at once the existence
of an infinite number of zeros of $\theta^{1/2}J_0(\theta)$:

(1)  $0 < j_1 < j_2 < j_3 < \cdots$

for which

(2)  $j_\nu - j_{\nu-1} < \pi$ and $j_\nu < \nu\pi \qquad (\nu = 1, 2, 3, \dots; \ j_0 = 0).$

Moreover the differences $j_\nu - j_{\nu-1}$ are monotonically increasing, so that the
sequence $\{j_\nu\}$ is convex. These facts are well known.*
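The statements about the zeros $j_\nu$ of $J_0$ can be checked numerically with a few lines of standard-library code. This is a sketch under our own assumptions: $J_0$ is evaluated from the integral representation $J_0(x) = \pi^{-1}\int_0^\pi \cos(x \sin t)\,dt$ by the trapezoidal rule, each zero is bracketed in $((\nu - 1)\pi, \nu\pi)$ (which relies on the well known fact that exactly one zero lies there), and the names `bessel_j0`, `j0_zeros` and `check_bessel_zero_spacing` are ours.

```python
import math

def bessel_j0(x, panels=400):
    """J_0(x) = (1/pi) * integral_0^pi cos(x sin t) dt, trapezoidal rule."""
    h = math.pi / panels
    total = 1.0  # half-weights of both end points, where the integrand equals 1
    for i in range(1, panels):
        total += math.cos(x * math.sin(i * h))
    return total * h / math.pi

def j0_zeros(count):
    """First `count` positive zeros of J_0 by bisection.

    Assumes exactly one sign change of J_0 in ((nu-1)*pi, nu*pi).
    """
    zeros = []
    for nu in range(1, count + 1):
        lo, hi = (nu - 1) * math.pi, nu * math.pi
        f_lo = bessel_j0(lo)
        for _ in range(60):
            mid = 0.5 * (lo + hi)
            f_mid = bessel_j0(mid)
            if (f_lo > 0) == (f_mid > 0):
                lo, f_lo = mid, f_mid
            else:
                hi = mid
        zeros.append(0.5 * (lo + hi))
    return zeros

def check_bessel_zero_spacing(count):
    """Assert (2) and the convexity of the sequence j_nu (with j_0 = 0)."""
    j = [0.0] + j0_zeros(count)
    d = [j[nu] - j[nu - 1] for nu in range(1, count + 1)]
    assert all(x < math.pi for x in d)                         # j_nu - j_{nu-1} < pi
    assert all(j[nu] < nu * math.pi for nu in range(1, count + 1))
    assert all(d[i] < d[i + 1] for i in range(count - 1))      # increasing differences
    return True

if __name__ == "__main__":
    print(j0_zeros(3))
    print(check_bessel_zero_spacing(10))
```

The trapezoidal rule is used because the integrand is smooth and periodic, so a few hundred panels already give near machine accuracy for the arguments needed here.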
Using (2) we see that $j_\nu - \nu\pi$ is decreasing, therefore


$j_\nu - \nu\pi \le j_1 - \pi,$


whence 


* Cf. Sturm 9, pp. 174-175. 
(3) jy + 0.014
2. Now we compare (A), $z = Y$, with

(B’)  $y'' + \{(n + \tfrac12)^2 + (2\theta)^{-2}\}y = 0,$

which has the solution $y = \theta^{1/2}J_0[(n + \tfrac12)\theta]$, $0 < \theta < \pi$ [condition (3’), §I, at
$\theta = a = 0$ is satisfied]. Since $\sin\theta < \theta$ we obtain the existence of at least one zero
of $P_n(\cos\theta)$ in every interval $j_{\nu-1}/(n + \tfrac12)$, $j_\nu/(n + \tfrac12)$, $\nu = 1, 2, \dots, n$. The fact
that $j_n/(n + \tfrac12) < \pi$ implies the existence of exactly one zero in each of these
intervals, so that

(4)  $j_{\nu-1}/(n + \tfrac12) < \theta_\nu < j_\nu/(n + \tfrac12) \qquad (\nu = 1, 2, \dots, n).$
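Inequality (4) compares two independently computable sets of numbers, so it lends itself to a direct numerical check. The sketch below combines an illustrative Newton solver for the Legendre zeros with a bisection solver for the zeros of $J_0$; both solvers and all function names are our own assumptions, and $j_0 = 0$ is used as in the text.

```python
import math

def legendre_thetas(n):
    """Zeros theta_1 < ... < theta_n of P_n(cos theta) (illustrative Newton solver)."""
    thetas = []
    for nu in range(1, n + 1):
        x = math.cos(math.pi * (nu - 0.25) / (n + 0.5))
        for _ in range(100):
            p_prev, p = 1.0, x
            for k in range(2, n + 1):
                p_prev, p = p, ((2 * k - 1) * x * p - (k - 1) * p_prev) / k
            dx = p * (x * x - 1.0) / (n * (x * p - p_prev))  # p / P_n'(x)
            x -= dx
            if abs(dx) < 1e-15:
                break
        thetas.append(math.acos(x))
    return thetas

def bessel_j0(x, panels=400):
    """J_0(x) from (1/pi) * integral_0^pi cos(x sin t) dt, trapezoidal rule."""
    h = math.pi / panels
    total = 1.0
    for i in range(1, panels):
        total += math.cos(x * math.sin(i * h))
    return total * h / math.pi

def j0_zeros(count):
    """First `count` positive zeros of J_0, one per interval ((nu-1)*pi, nu*pi)."""
    zeros = []
    for nu in range(1, count + 1):
        lo, hi = (nu - 1) * math.pi, nu * math.pi
        f_lo = bessel_j0(lo)
        for _ in range(60):
            mid = 0.5 * (lo + hi)
            f_mid = bessel_j0(mid)
            if (f_lo > 0) == (f_mid > 0):
                lo, f_lo = mid, f_mid
            else:
                hi = mid
        zeros.append(0.5 * (lo + hi))
    return zeros

def check_bessel_bounds(n, j):
    """True if j[nu-1]/(n+1/2) < theta_nu < j[nu]/(n+1/2) for nu = 1..n; j[0] = 0."""
    th = legendre_thetas(n)
    return all(j[nu - 1] / (n + 0.5) < th[nu - 1] < j[nu] / (n + 0.5)
               for nu in range(1, n + 1))

if __name__ == "__main__":
    j = [0.0] + j0_zeros(20)
    for n in (1, 2, 5, 10, 20):
        print(n, check_bessel_bounds(n, j))
```

The list `j` is computed once and reused for every $n$, since the Bessel zeros do not depend on $n$.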


The upper estimate is particularly important. It is better than the upper
estimate of Bruns. It yields at once a lower bound for $j_\nu$. In fact, let $\nu$ be any
positive integer, and $n = 2\nu - 1$. Then $\nu = (n + 1)/2$, $\theta_\nu = \pi/2$. We have
therefore $\pi/2 < j_\nu/(2\nu - \tfrac12)$, that is, in view of (3),*


(5) (v—i)r <j S (v = 1, 2,3,---). 


The upper estimate (4) for $\theta_\nu = \theta_\nu(n)$ is the best possible for fixed $\nu$, $n \to \infty$.
Indeed it is known that

(6)  $\lim_{n \to \infty} (n + \tfrac12)\,\theta_\nu(n) = j_\nu.$


3. The first inequality in (4) is not particularly sharp. To obtain a better
one, we use the elementary inequality†

$(\sin\theta)^{-2} - \theta^{-2} \le 1 - (2/\pi)^2 = k, \qquad 0 < \theta \le \pi/2,$

and compare (A), $z = y$, with

(B’’)  $Y'' + \{(n + \tfrac12)^2 + k/4 + (2\theta)^{-2}\}Y = 0$

instead of (B’). Thus we obtain

(7)  $\theta_\nu > j_\nu/[(n + \tfrac12)^2 + k/4]^{1/2}, \qquad 0 < \theta_\nu \le \pi/2, \qquad k/4 = 0.148678816\cdots.$

The same argument gives a sharper inequality for the zeros in the interval
$0 < \theta \le v$, where $v$ is a fixed positive number, $v \le \pi/2$; we then obtain (7) with
a constant $k_v$ instead of $k$, where $k_v = (\sin v)^{-2} - v^{-2}$. For example,

$\tfrac14 k_{\pi/4} = 0.094715264\cdots,$

$\tfrac14 k_{\pi/6} = 0.088109344\cdots.$
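The constants above are easy to recompute. The short sketch below evaluates $k_v = (\sin v)^{-2} - v^{-2}$ and checks on a grid that this function increases on $(0, \pi/2]$, which is what the elementary inequality uses; the function names are ours and the grid check is only a numerical indication.

```python
import math

def k_const(v):
    """k_v = (sin v)^(-2) - v^(-2) for 0 < v <= pi/2."""
    return 1.0 / math.sin(v) ** 2 - 1.0 / v ** 2

def is_increasing_on_grid(start=0.05, steps=150):
    """True if k_v increases along an equally spaced grid from `start` to pi/2."""
    stop = math.pi / 2
    grid = [start + (stop - start) * i / steps for i in range(steps + 1)]
    values = [k_const(v) for v in grid]
    return all(values[i] < values[i + 1] for i in range(steps))

if __name__ == "__main__":
    print("k/4        =", k_const(math.pi / 2) / 4)
    print("k_{pi/4}/4 =", k_const(math.pi / 4) / 4)
    print("k_{pi/6}/4 =", k_const(math.pi / 6) / 4)
    print("increasing on grid:", is_increasing_on_grid())
```

The computed values agree with the printed constants to about eight decimal places.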


* Cf. Watson 11, pp. 489-490; there it is shown that the positive zeros of $J_0(\theta)$ lie in the intervals
$(\nu - \tfrac14)\pi$, $(\nu - \tfrac18)\pi$ (theorem of Schafheitlin).
† The function on the left side is increasing in $0, \pi/2$.

Formula (7) gives also an upper bound for $j_\nu$, putting as before $n = 2\nu - 1$,
$\nu = (n + 1)/2$, $\theta_\nu = \pi/2$:

(8)  $j_\nu < \dfrac{\pi}{2}\,\{(2\nu - \tfrac12)^2 + k/4\}^{1/2} < (\nu - \tfrac14)\pi + \dfrac{k\pi}{32(\nu - \frac14)}.$

This inequality for $\nu = 1$ is not as good as (3); it is much better, however, for
$\nu = 2$ and especially for large $\nu$.


IV. ULTRASPHERICAL POLYNOMIALS AND GENERAL BESSEL FUNCTIONS 


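1. We consider equation (C) in the “principal case” $0 < \mu < 1$. The same
argument as in the special case $\mu = \tfrac12$ (§II, 1) gives, with the same notation
as was used in that case,

(1)  $\dfrac{(\nu - 1)\pi}{n + \mu} < \theta_\nu < \dfrac{\nu\pi}{n + \mu} \qquad (\nu = 1, 2, \dots, n),$

from which we obtain by means of the symmetric property

(2)  $\dfrac{(\nu - 1 + \mu)\pi}{n + \mu} < \theta_\nu < \dfrac{\nu\pi}{n + \mu} \qquad (\nu = 1, 2, \dots, n).$

These are the inequalities corresponding to those of Bruns.

2. We have further

(3)  $\dfrac{(\nu - \frac12(1 - \mu))\pi}{n + \mu} < \theta_\nu < \dfrac{\nu\pi}{n + 1} \qquad (\nu = 1, 2, \dots, [n/2]),$

corresponding to the Markoff-Stieltjes inequalities. The proof is the same as
in §II, 2; it is based on the convexity of the sequences $\theta_\nu$, and

(4)  $\theta_\nu - \dfrac{\nu\pi}{n + 1} \qquad (\nu = 0, 1, 2, \dots, [n/2] + 1),$

and on the inequality

(5)  $\theta_\nu - \theta_{\nu-1} < \pi/(n + \mu) \qquad (\nu = 1, 2, \dots, n).$

3. In the “principal case” $-\tfrac12 < \lambda < \tfrac12$ we obtain similar results for the
zeros $j_\nu(\lambda)$ of $\theta^{1/2}J_\lambda(\theta)$ as in §III, 1. We have in particular

(6)  $j_\nu(\lambda) - j_{\nu-1}(\lambda) < \pi, \quad j_\nu(\lambda) < \nu\pi \qquad (\nu = 1, 2, 3, \dots; \ j_0(\lambda) = 0),$

and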

* x>0. 


(7)  $j_\nu(\lambda) \le \nu\pi - \pi + j_1(\lambda).$

We further compare (C) with (D) putting $\mu = \lambda + \tfrac12$; we obtain [condition
(3’), §I, is satisfied]

(8)  $j_{\nu-1}(\lambda)/(n + \mu) < \theta_\nu < j_\nu(\lambda)/(n + \mu).$

The upper estimate here is the best possible in a sense analogous to §III, (6).
We obtain from it as before $\pi/2 < j_\nu(\lambda)/(2\nu - 1 + \mu)$, so that*

(9)  $(\nu + \lambda/2 - \tfrac14)\pi < j_\nu(\lambda) \le \nu\pi - \pi + j_1(\lambda) \ (< \nu\pi) \qquad (\nu = 1, 2, 3, \dots).$

Remarks. Comparison of $(\sin\theta)^{\mu}P_n^{(\mu)}(\cos\theta)$ with $\theta^{1/2}J_{-\lambda}(\theta)$ [for $\lambda = 0$ with
the Bessel function of the second kind $Y_0(\theta)$] is also possible. Condition (3),
§I, is now satisfied only if $\lambda < 0$. We consequently obtain

$\theta_\nu < j_\nu(-\lambda)/(n + \mu)$ if $\lambda < 0$; $\qquad \theta_\nu < j_{\nu+1}(-\lambda)/(n + \mu)$ if $\lambda \ge 0$.

On the other hand the ordinary form of Sturm’s theorem gives at once
$j_\nu(-\lambda) > j_\nu(\lambda)$ in the first case, $j_{\nu+1}(-\lambda) > j_\nu(\lambda)$ in the second, so that these
bounds are less precise than (8).

The lower estimate in (8), on the other hand, is always valid with $-\lambda$
in place of $\lambda$. For negative $\lambda$ this result is better than (8).

4. We obtain a better lower bound of $\theta_\nu$ than that given in (8) in a way
similar to that of §III, 3:

(10)  $\theta_\nu > j_\nu(\lambda)/[(n + \mu)^2 + (\tfrac14 - \lambda^2)k]^{1/2}, \qquad 0 < \theta_\nu \le \pi/2,$

where $k$ has the same meaning as in §III, 3. Hence it follows as before that

(11)  $j_\nu(\lambda) < (\nu + \lambda/2 - \tfrac14)\pi + \dfrac{(k/8)(\tfrac14 - \lambda^2)\pi}{\nu + \lambda/2 - \tfrac14}.$

For large values of $\nu$, this bound is better than either the upper bound in (9)
or that due to Schafheitlin, quoted in the last footnote.


Part 2. TRIGONOMETRIC POLYNOMIALS 


V. DISTRIBUTION OF ZEROS 


1. Let $\lambda_0, \lambda_1, \lambda_2, \dots, \lambda_m$ be non-negative numbers satisfying the
inequalities


* Cf. Watson 11, pp. 490-491, where a theorem of Schafheitlin (in extended form) is stated: the 
zeros of J)(@) lie in the intervals 
(v+A/2 —3)r SOS = 1, }). 
The upper bound (9) is better than this, since 
RQ) 97, 
for Jy[(A/4+3)] <0 (cf. Watson 11, p. 491, ¢=*/8). 


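(1)  $\lambda_0 > \lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_m \ge 0.$

We shall prove that the trigonometric expressions

(2)
$p(t) = \lambda_0 \cos mt + \lambda_1 \cos (m - 1)t + \dots + \lambda_{m-1} \cos t + \lambda_m,$
$q(t) = \lambda_0 \sin mt + \lambda_1 \sin (m - 1)t + \dots + \lambda_{m-1} \sin t,$
$r(t) = \lambda_0 \cos (m + \tfrac12)t + \lambda_1 \cos (m - \tfrac12)t + \dots + \lambda_{m-1} \cos \tfrac32 t + \lambda_m \cos \tfrac12 t,$
$s(t) = \lambda_0 \sin (m + \tfrac12)t + \lambda_1 \sin (m - \tfrac12)t + \dots + \lambda_{m-1} \sin \tfrac32 t + \lambda_m \sin \tfrac12 t$

have only real and simple zeros which are regularly distributed in the following
sense. Denoting by $t_1, t_2, t_3, \dots$ the zeros in question in the interval
$0 < t < \pi$, in increasing order, we have for $p(t)$, $q(t)$, $r(t)$, $s(t)$, respectively

(3)
$\dfrac{(\mu - \frac12)\pi}{m + \frac12} < t_\mu < \dfrac{(\mu + \frac12)\pi}{m + \frac12} \qquad (\mu = 1, 2, \dots, m),$
$\dfrac{\mu\pi}{m + \frac12} < t_\mu < \dfrac{(\mu + 1)\pi}{m + \frac12} \qquad (\mu = 1, 2, \dots, m - 1),$
$\dfrac{(\mu - \frac12)\pi}{m + 1} < t_\mu < \dfrac{(\mu + \frac12)\pi}{m + 1} \qquad (\mu = 1, 2, \dots, m),$
$\dfrac{\mu\pi}{m + 1} < t_\mu < \dfrac{(\mu + 1)\pi}{m + 1} \qquad (\mu = 1, 2, \dots, m).$

Besides these zeros in the interval $0 < t < \pi$, there are, of course, the other zeros
$\pm t_\mu + 2h\pi$; moreover $q(t)$ has the zeros $h\pi$, $r(t)$ the zeros $(2h + 1)\pi$, $s(t)$ the
zeros $2h\pi$. Here $h$ is an arbitrary integer. All these zeros are simple.

As another formulation, we have in the open intervals

(4)
$\Big( \dfrac{(\mu - \frac12)\pi}{m + \frac12}, \dfrac{(\mu + \frac12)\pi}{m + \frac12} \Big), \quad \Big( \dfrac{\mu\pi}{m + \frac12}, \dfrac{(\mu + 1)\pi}{m + \frac12} \Big),$
$\Big( \dfrac{(\mu - \frac12)\pi}{m + 1}, \dfrac{(\mu + \frac12)\pi}{m + 1} \Big), \quad \Big( \dfrac{\mu\pi}{m + 1}, \dfrac{(\mu + 1)\pi}{m + 1} \Big),$

respectively, exactly one zero of $p(t)$, $q(t)$, $r(t)$, $s(t)$; $\mu$ runs here over all
integer values except

(5)
$\mu \equiv 0 \pmod{2m + 1}, \qquad \mu \equiv -1, 0 \pmod{2m + 1},$
$\mu \equiv 0 \pmod{2m + 2}, \qquad \mu \equiv -1, 0 \pmod{2m + 2}$

in the corresponding cases. In the second and fourth case we further have the
trivial zeros $t = 0, \pm 2\pi, \pm 4\pi, \dots$; all zeros (4) are “non-trivial” except in
the intervals $\mu \equiv m \pmod{2m + 1}$ in the second, $\mu \equiv m + 1 \pmod{2m + 2}$ in
the third case, containing the zeros $t = \pm\pi, \pm 3\pi, \dots$.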
2. The proof of the inequalities just formulated is very simple and can
be based entirely on the classical fact that the trigonometric sums

(6)  $\sigma_m(t) = \sin \tfrac12 t + \sin \tfrac32 t + \dots + \sin (m + \tfrac12)t \qquad (m = 0, 1, 2, 3, \dots)$

are all non-negative for $0 \le t \le 2\pi$. This property is well known from Fejér’s
summability theory of Fourier series, in which it plays a decisive role.
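The non-negativity of $\sigma_m(t)$ can be read off from the standard closed form $\sigma_m(t) = \sin^2\big(\tfrac12 (m + 1)t\big)/\sin \tfrac12 t$, which is not written out in the text. The following sketch compares the sum with this closed form on a grid; the function names are ours.

```python
import math

def sigma(m, t):
    """sigma_m(t) = sin(t/2) + sin(3t/2) + ... + sin((m + 1/2) t)."""
    return sum(math.sin((k + 0.5) * t) for k in range(m + 1))

def sigma_closed_form(m, t):
    """Closed form sin^2((m+1) t / 2) / sin(t/2), valid for 0 < t < 2*pi."""
    return math.sin(0.5 * (m + 1) * t) ** 2 / math.sin(0.5 * t)

def max_identity_error(m_max=12, samples=200):
    """Largest deviation between the sum and the closed form on a grid in (0, 2*pi)."""
    worst = 0.0
    for m in range(m_max + 1):
        for i in range(1, samples):
            t = 2.0 * math.pi * i / samples
            worst = max(worst, abs(sigma(m, t) - sigma_closed_form(m, t)))
    return worst

def min_sigma(m_max=12, samples=200):
    """Smallest value of sigma_m on the same grid (non-negative up to rounding)."""
    return min(sigma(m, 2.0 * math.pi * i / samples)
               for m in range(m_max + 1) for i in range(1, samples))

if __name__ == "__main__":
    print("max identity error:", max_identity_error())
    print("min sigma on grid: ", min_sigma())
```

Since the closed form is a square divided by $\sin \tfrac12 t > 0$, the sum can vanish inside $(0, 2\pi)$ only where $(m + 1)t$ is a multiple of $2\pi$.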
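We have

(7)  $-\Im\{ e^{-i(m + \frac12)t}[p(t) + iq(t)] \} = -\Im\{ e^{-i(m + 1)t}[r(t) + is(t)] \} = \sum_{k=0}^{m} \lambda_k \sin (k + \tfrac12)t.$

Partial summation shows at once that the last expression is non-negative for
$0 \le t \le 2\pi$. More precisely, it remains decidedly positive for $0 < t < 2\pi$ because

(8)  $\sum_{k=0}^{m} \lambda_k \sin (k + \tfrac12)t = (\lambda_0 - \lambda_1)\sigma_0(t) + (\lambda_1 - \lambda_2)\sigma_1(t) + \dots + (\lambda_{m-1} - \lambda_m)\sigma_{m-1}(t) + \lambda_m\sigma_m(t),$

the first term being positive. From this remark we deduce the important
inequalities

(9)
$p(t) \sin (m + \tfrac12)t - q(t) \cos (m + \tfrac12)t > 0,$
$r(t) \sin (m + 1)t - s(t) \cos (m + 1)t > 0, \qquad 0 < t < 2\pi,$

so that for the values of $\mu$ mentioned in (3),

(10)
$\operatorname{sgn} p\Big( \dfrac{(\mu + \frac12)\pi}{m + \frac12} \Big) = \operatorname{sgn} r\Big( \dfrac{(\mu + \frac12)\pi}{m + 1} \Big) = (-1)^{\mu},$
$\operatorname{sgn} q\Big( \dfrac{\mu\pi}{m + \frac12} \Big) = \operatorname{sgn} s\Big( \dfrac{\mu\pi}{m + 1} \Big) = (-1)^{\mu + 1},$

and this gives our assertion.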
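The argument above can be traced numerically for a concrete coefficient sequence. In this sketch the sample coefficients $\lambda = (5, 4, 3, 3, 1, 0)$ are our own choice (they satisfy $\lambda_0 > \lambda_1 \ge \dots \ge \lambda_m \ge 0$), and the function names are hypothetical; the code evaluates the two left-hand sides of the inequalities deduced from (8) on a grid and checks the resulting alternation of signs at the end points of the intervals.

```python
import math

def make_pqrs(lam):
    """Return p, q, r, s as Python functions for coefficients lam[0..m]."""
    m = len(lam) - 1
    def p(t): return sum(lam[k] * math.cos((m - k) * t) for k in range(m + 1))
    def q(t): return sum(lam[k] * math.sin((m - k) * t) for k in range(m + 1))
    def r(t): return sum(lam[k] * math.cos((m - k + 0.5) * t) for k in range(m + 1))
    def s(t): return sum(lam[k] * math.sin((m - k + 0.5) * t) for k in range(m + 1))
    return p, q, r, s

def min_margin(lam, samples=400):
    """Smallest value of both left-hand sides of the inequalities on a grid in (0, 2*pi)."""
    m = len(lam) - 1
    p, q, r, s = make_pqrs(lam)
    worst = float("inf")
    for i in range(1, samples):
        t = 2.0 * math.pi * i / samples
        a = p(t) * math.sin((m + 0.5) * t) - q(t) * math.cos((m + 0.5) * t)
        b = r(t) * math.sin((m + 1) * t) - s(t) * math.cos((m + 1) * t)
        worst = min(worst, a, b)
    return worst

def check_sign_pattern(lam):
    """True if p, r have sign (-1)^mu and q, s sign (-1)^(mu+1) at the end points."""
    m = len(lam) - 1
    p, q, r, s = make_pqrs(lam)
    ok = True
    for mu in range(0, m + 1):
        ok &= (-1) ** mu * p((mu + 0.5) * math.pi / (m + 0.5)) > 0
        ok &= (-1) ** mu * r((mu + 0.5) * math.pi / (m + 1)) > 0
    for mu in range(1, m + 1):
        ok &= (-1) ** (mu + 1) * q(mu * math.pi / (m + 0.5)) > 0
        ok &= (-1) ** (mu + 1) * s(mu * math.pi / (m + 1)) > 0
    return ok

if __name__ == "__main__":
    lam = [5, 4, 3, 3, 1, 0]        # sample data, not from the paper
    print("min margin:", min_margin(lam))
    print("sign pattern:", check_sign_pattern(lam))
```

Because the signs alternate at consecutive end points, each interval contains at least one zero, and counting the zeros of a trigonometric polynomial of the given order shows that it contains exactly one.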
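3. Under more restrictive conditions than (1), sharper inequalities can be
stated. Let the coefficients satisfy, for example, the following conditions:

(11)  $2\lambda_0 - \lambda_1 > \lambda_1 - \lambda_2 \ge \lambda_2 - \lambda_3 \ge \dots \ge \lambda_{m-1} - \lambda_m \ge \lambda_m \ge 0.$

(This is always satisfied if the sequence $\lambda_0, \lambda_1, \dots, \lambda_m, 0, 0$ is convex and not
identically zero.) We then prove that

(12)  $\dfrac{\mu}{m}\pi, \quad \dfrac{\mu + \frac12}{m}\pi, \quad \dfrac{\mu}{m + \frac12}\pi, \quad \dfrac{\mu + \frac12}{m + \frac12}\pi$

can be taken as upper bounds of $t_\mu$, instead of those in (3) respectively. To
show this, it is sufficient to prove that

(13)  $\operatorname{sgn} p\Big( \dfrac{\mu}{m}\pi \Big) = \operatorname{sgn} q\Big( \dfrac{\mu + \frac12}{m}\pi \Big) = \operatorname{sgn} r\Big( \dfrac{\mu}{m + \frac12}\pi \Big) = \operatorname{sgn} s\Big( \dfrac{\mu + \frac12}{m + \frac12}\pi \Big) = (-1)^{\mu}.$

We have

$p(t) = \sum_{k=0}^{m} \lambda_k \cos (m - k)t, \qquad q(t) = \sum_{k=0}^{m-1} \lambda_k \sin (m - k)t,$

so that

(14)
$2 \sin \tfrac12 t\, \{p(t) \cos mt + q(t) \sin mt\} = 2 \sin \tfrac12 t \sum_{k=0}^{m} \lambda_k \cos kt$
$= \lambda_0 \sin \tfrac12 t + \sum_{k=0}^{m-1} (\lambda_k - \lambda_{k+1}) \sin (k + \tfrac12)t + \lambda_m \sin (m + \tfrac12)t$
$= (2\lambda_0 - \lambda_1) \sin \tfrac12 t + (\lambda_1 - \lambda_2) \sin \tfrac32 t + \cdots$
$\quad + (\lambda_{m-1} - \lambda_m) \sin (m - \tfrac12)t + \lambda_m \sin (m + \tfrac12)t.$

The positivity of this trigonometric polynomial in $0, \pi$ (even in $0, 2\pi$) is a
consequence of our condition, so that

(15)  $p(t) \cos mt + q(t) \sin mt > 0, \qquad 0 < t < 2\pi.$

Similarly it is shown that

(16)  $r(t) \cos (m + \tfrac12)t + s(t) \sin (m + \tfrac12)t > 0, \qquad 0 < t < 2\pi.$

Thus our theorem is established.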
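The sharper upper bounds can be tried out on a small convex example. In the sketch below the coefficients $\lambda = (10, 6, 3, 1)$ are our own sample data (the sequence $10, 6, 3, 1, 0, 0$ is convex), and the function names are hypothetical. The code checks the stated convexity-type condition, the positivity of $\sum \lambda_k \cos kt$ on a grid, and the change of sign between each lower end point from (3) and the new upper bound.

```python
import math

def make_pqrs(lam):
    """Return p, q, r, s as Python functions for coefficients lam[0..m]."""
    m = len(lam) - 1
    def p(t): return sum(lam[k] * math.cos((m - k) * t) for k in range(m + 1))
    def q(t): return sum(lam[k] * math.sin((m - k) * t) for k in range(m + 1))
    def r(t): return sum(lam[k] * math.cos((m - k + 0.5) * t) for k in range(m + 1))
    def s(t): return sum(lam[k] * math.sin((m - k + 0.5) * t) for k in range(m + 1))
    return p, q, r, s

def satisfies_condition(lam):
    """2*lam0 - lam1 > lam1 - lam2 >= ... >= lam_{m-1} - lam_m >= lam_m >= 0 (m >= 1)."""
    m = len(lam) - 1
    c = [2 * lam[0] - lam[1]] + [lam[k] - lam[k + 1] for k in range(1, m)] + [lam[m]]
    return c[0] > c[1] and all(c[i] >= c[i + 1] for i in range(1, m)) and c[m] >= 0

def min_cosine_sum(lam, samples=400):
    """Minimum of sum lam_k cos(k t) = p cos(mt) + q sin(mt) on a grid in (0, 2*pi)."""
    return min(sum(l * math.cos(k * 2.0 * math.pi * i / samples)
                   for k, l in enumerate(lam)) for i in range(1, samples))

def improved_bounds_hold(lam):
    """True if each function changes sign between the old lower and new upper bound."""
    m = len(lam) - 1
    p, q, r, s = make_pqrs(lam)
    pi = math.pi
    ok = True
    for mu in range(1, m + 1):
        ok &= p((mu - 0.5) * pi / (m + 0.5)) * p(mu * pi / m) < 0
        ok &= r((mu - 0.5) * pi / (m + 1)) * r(mu * pi / (m + 0.5)) < 0
        ok &= s(mu * pi / (m + 1)) * s((mu + 0.5) * pi / (m + 0.5)) < 0
    for mu in range(1, m):
        ok &= q(mu * pi / (m + 0.5)) * q((mu + 0.5) * pi / m) < 0
    return ok

if __name__ == "__main__":
    lam = [10, 6, 3, 1]             # sample data, not from the paper
    print(satisfies_condition(lam), min_cosine_sum(lam), improved_bounds_hold(lam))
```

A sign change between the two end points locates a zero in the shorter interval, which is exactly how the improved upper bounds are obtained in the text.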
4. The reality of the zeros of

(17)  $\alpha p(t) + \beta q(t), \qquad \alpha r(t) + \beta s(t)$

follows in the same way as in 2, provided that the inequalities (1) are satisfied;
here $\alpha$, $\beta$ are arbitrary real constants not vanishing simultaneously. For these
zeros inequalities similar to those given above hold; they are all simple.

To prove this statement, put $\alpha + i\beta = \rho e^{i\delta}$ ($\rho > 0$, $\delta$ real). We obtain from (9)
for $(m + \tfrac12)t = (\mu - \tfrac12)\pi + \delta$, $0 < t < 2\pi$,

$(-1)^{\mu + 1}\{ p(t) \cos\delta + q(t) \sin\delta \} > 0,$

so that

(18)  $\operatorname{sgn} \{\alpha p(t) + \beta q(t)\} = (-1)^{\mu + 1}.$

We can assume $0 < \delta < \pi$, $\delta \ne \pi/2$. The last result gives immediately the
existence of at least one (consequently of exactly one) zero in each interval

(19)  $\dfrac{(\mu - \frac12)\pi + \delta}{m + \frac12} < t < \dfrac{(\mu + \frac12)\pi + \delta}{m + \frac12}, \qquad \begin{cases} \mu = 1, 2, \dots, 2m, & \text{if } 0 < \delta < \pi/2, \\ \mu = 0, 1, \dots, 2m - 1, & \text{if } \pi/2 < \delta < \pi. \end{cases}$

These are the zeros in the interval $0 < t < 2\pi$. All zeros are real and simple,
lying in the intervals (19), where $\mu$ runs over all integer values, $\mu \not\equiv 0 \pmod{2m + 1}$.

Trigonometric polynomials $\alpha r(t) + \beta s(t)$ can be treated in an analogous
way with a similar result.
VI. ON A THEOREM OF PÓLYA
The elementary inequalities of the preceding paragraph lead in a direct
way to a theorem of Pólya [7], giving even a slightly more precise result. We
consider the entire functions of $z$

(1)  $U(z) = \int_0^1 f(x) \cos zx\,dx, \qquad V(z) = \int_0^1 f(x) \sin zx\,dx$

and we prove the following theorem:

Let $f(x)$ be non-negative, monotonically non-decreasing and not identically
zero in $0 < x < 1$; further, let the integral $\int_0^1 f(x)\,dx$ exist. Let $\alpha$ and $\beta$ denote
real constants not both zero, $\alpha + i\beta = \rho e^{i\delta}$, $\rho > 0$, $0 < \delta \le \pi$. The entire function
$\alpha U(z) + \beta V(z)$ has only real and simple zeros; every interval

(2)  $(\mu - \tfrac12)\pi + \delta, \quad (\mu + \tfrac12)\pi + \delta \qquad (\mu = 0, \pm 1, \pm 2, \dots),$

except that with $z = 0$,* contains exactly one zero as inner point.

The only exception is the case in which $f(x)$ is a step function with jumps
at the points of the form $1 - 2\pi h/|(\mu - \tfrac12)\pi + \delta|$, $h$ and $\mu$ integers. In this case the
zeros are also real and lie in the closed intervals (2).
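For a concrete instance of the theorem one may take $f(x) = x$, which is our own choice of example and is not a step function. Then $U(z)\cos\delta + V(z)\sin\delta = \int_0^1 x \cos (zx - \delta)\,dx$ has the elementary closed form used below, and one can check both the alternation of signs at the end points of the intervals (2) and the position of individual zeros. The function names are hypothetical.

```python
import math

def F(z, delta):
    """integral_0^1 x*cos(z*x - delta) dx for z != 0 (the case f(x) = x, rho = 1)."""
    return math.sin(z - delta) / z + (math.cos(z - delta) - math.cos(delta)) / z ** 2

def endpoint(mu, delta):
    """Left end point (mu - 1/2)*pi + delta of the interval with index mu."""
    return (mu - 0.5) * math.pi + delta

def endpoint_signs_ok(delta, mu_max=12):
    """True if (-1)^(mu+1) * F > 0 at every positive end point with mu <= mu_max."""
    for mu in range(0, mu_max + 1):
        z = endpoint(mu, delta)
        if z > 0 and not (-1) ** (mu + 1) * F(z, delta) > 0:
            return False
    return True

def zero_in_interval(mu, delta):
    """Bisection for the zero of F in the interval with index mu (left end point > 0)."""
    lo, hi = endpoint(mu, delta), endpoint(mu + 1, delta)
    f_lo = F(lo, delta)
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        f_mid = F(mid, delta)
        if (f_lo > 0) == (f_mid > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    for delta in (0.3, math.pi / 2, 2.5, math.pi):
        print(delta, endpoint_signs_ok(delta))
    # delta = pi/2 gives V(z) = (sin z - z cos z)/z^2, whose zeros solve tan z = z
    print(zero_in_interval(1, math.pi / 2))
```

For $\delta = \pi/2$ the zero found in $(\pi, 2\pi)$ is the first positive root of $\tan z = z$, and for $\delta = \pi$ the zero in $(\tfrac32\pi, \tfrac52\pi)$ is $z = 2\pi$.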


The proof is based (in a somewhat different form from that in the paper of
Pólya 7, p. 361) on the trigonometric expressions


U,.(2) = — >> (— ) cos 


M k=O 


(3) 


m + m + 


the symbol $\sum'$ meaning that the highest term $k = m$ is to be multiplied by
$1 + m^{-1}$ or more generally by a factor $> 1$ tending to 1 for $m \to \infty$. These
expressions tend respectively to $U(z)$ and $V(z)$ uniformly in an arbitrary finite
region of the $z$-plane. The same is valid for $\alpha U_m(z) + \beta V_m(z)$. Hence we obtain,
by means of the results of §V, 4, and of a well known theorem of Hurwitz
(used also by Pólya), that all the zeros of $\alpha U(z) + \beta V(z)$ lie in the closed
intervals (2).

* For $\alpha = \cos\delta = 0$, i.e., for $V(z)$ itself, we have two exceptional intervals, namely $-\pi, 0$ and $0, \pi$
with the single simple zero $z = 0$.

Zeros in the inner part of these intervals are of course always simple. The
only double zeros must have the form $z = z_0 = (\mu_0 - \tfrac12)\pi + \delta$ ($\mu_0$ integer). Now
we have*

(4)  $-\Im\, e^{-iz}\{ U(z) + iV(z) \} = \int_0^1 f(1 - x) \sin zx\,dx = \int_0^\infty g(x) \sin zx\,dx,$

putting $g(x) = f(1 - x)$ for $0 < x < 1$, $g(x) = 0$ for $x > 1$. The last integral can be
written for $z > 0$ in the form (cf. Pólya 7, p. 378)

(5)  $\int_0^{\pi/z} \sin zx\, \{ g(x) - g(x + \pi/z) + g(x + 2\pi/z) - g(x + 3\pi/z) + \cdots \}\,dx,$

which is positive “in general” for all values of $z$, $z > 0$. Consequently we
obtain for $z = z_0 = (\mu_0 - \tfrac12)\pi + \delta$

(6)  $(-1)^{\mu_0 + 1}\{ U(z_0) \cos\delta + V(z_0) \sin\delta \} > 0,$

so that $z_0$ is not a zero for $U(z) \cos\delta + V(z) \sin\delta$. Incidentally, this argument
yields at once the existence of at least one zero in the intervals (2).

Zeros of the form $z = z_0 = (\mu_0 - \tfrac12)\pi + \delta$, $z_0 > 0$,† can occur only if the integral
(5) vanishes for $z = z_0$, that is, if

(7)  $g(x) - g(x + \pi/z_0) = 0, \quad g(x + 2\pi/z_0) - g(x + 3\pi/z_0) = 0, \ \dots, \qquad 0 < x < \pi/z_0.$

This means that $g(x) = f(1 - x)$ is a step function with jumps at the points
$2\pi h/z_0$ ($h = 1, 2, 3, \dots$). Indeed under this assumption we have

$U(z) \cos\delta + V(z) \sin\delta = \int_0^1 f(x) \cos (zx - \delta)\,dx = \int_0^1 g(x) \cos (zx - z + \delta)\,dx$

(8)  $= \sum_{k=0}^{m-1} g_k \int_{kd}^{(k+1)d} \cos (zx - z + \delta)\,dx,$

$g_0 \ge g_1 \ge \dots \ge g_{m-1} \ge 0, \ g_0 > 0; \qquad d = 2\pi/z_0, \quad m = [z_0/(2\pi)],$

where $g_k$ denotes the value of $g(x)$ in $kd < x < (k + 1)d$. Now the last sum is

$\sum_{k=0}^{m-1} g_k\, \dfrac{\sin (z(k + 1)d - z + \delta) - \sin (zkd - z + \delta)}{z} = \dfrac{2 \sin (zd/2)}{z} \sum_{k=0}^{m-1} g_k \cos (z(k + \tfrac12)d - z + \delta).$

Both factors have here the simple zero $z = z_0$.

The theorem proved in §V, 3, gives similarly the results of Pólya for
convex and increasing $f(x)$.

* The following argument corresponds to the treatment of the “algebraic” case given in §V, 2.
† We can assume $z_0 > 0$ because (4) is an odd function.


VII. TRIGONOMETRIC POLYNOMIALS OF THE LEGENDRE TYPE 
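1. Let

(1)  $\alpha_0, \alpha_1, \alpha_2, \dots, \alpha_n, \dots$

be a given sequence of positive numbers. The cosine polynomials

(2)  $f_n(\theta) = \alpha_0\alpha_n \cos n\theta + \alpha_1\alpha_{n-1} \cos (n - 2)\theta + \dots + \begin{cases} \tfrac12 \alpha_{n/2}^2 & \text{if } n \text{ is even,} \\ \alpha_{(n-1)/2}\,\alpha_{(n+1)/2} \cos\theta & \text{if } n \text{ is odd,} \end{cases}$

$\qquad = \sum_{k=0}^{[n/2]} \alpha_k\alpha_{n-k} \cos (n - 2k)\theta$*

have been considered by Fejér [3]; Legendre polynomials $P_n(\cos\theta)$ are
particular cases for

(3)  $\alpha_n = g_n = \dfrac{1 \cdot 3 \cdots (2n - 1)}{2 \cdot 4 \cdots 2n}.$

In what follows we use also the corresponding sine polynomials

(4)  $g_n(\theta) = \sum_{k=0}^{[n/2]} \alpha_k\alpha_{n-k} \sin (n - 2k)\theta.$

I have proved in a previous paper [10] that all zeros of $f_n(\theta)$ are real and
simple provided that the sequence

(5)  $\alpha_1/\alpha_0, \ \alpha_2/\alpha_1, \ \alpha_3/\alpha_2, \ \dots, \ \alpha_n/\alpha_{n-1}, \ \dots$

is monotonically increasing. Under this condition the coefficients of (2) are
monotonically increasing. On putting $2\theta = t$ and $m = n/2$ or $m = (n - 1)/2$,
$f_n(\theta)$ becomes of the type $p(t)$ or $r(t)$ in §V, (2), respectively. The inequalities
of §V, (3), give at once the following information about the zeros $\theta_\nu$ of $f_n(\theta)$,
$0 < \theta_1 < \theta_2 < \cdots$:

(6)  $\dfrac{(\nu - \frac12)\pi}{n + 1} < \theta_\nu < \dfrac{(\nu + \frac12)\pi}{n + 1}.$

* For $n$ even, the last term $k = n/2$ is to be multiplied by $\tfrac12$.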

2. The inequalities just obtained are not so precise as those of Bruns.
By making some restrictions on the sequence $\{\alpha_n\}$, they can be improved.
We show the possibility of deriving by this very elementary method the
lower estimate not only in the theorem of Bruns but also in that of
Markoff-Stieltjes.

Let the sequence $\{\alpha_n\}$ be of the form

(7)  $\alpha_n = \int_0^1 x^n f(x)\,dx,$

where $f(x)$ is non-negative and integrable in the Lebesgue sense with $\alpha_0 > 0$.

The sequence $\alpha_n = g_n$, corresponding to the Legendre polynomials, is of
this type [cf. (3)].

As a consequence of our condition we first see that the sequence

(8)  $\alpha_0\alpha_n, \ \alpha_1\alpha_{n-1}, \ \dots, \ \alpha_p\alpha_{n-p}, \qquad p = [n/2],$

is positive, monotonically decreasing, and convex. Indeed Schwarz’s inequality
gives $\alpha_k^2 \le \alpha_{k-1}\alpha_{k+1}$, so that $\alpha_k/\alpha_{k-1}$ is increasing. The convexity follows
from the representation

(9)  $\alpha_k\alpha_{n-k} = \int_0^1 \int_0^1 x^k y^{n-k} f(x) f(y)\,dx\,dy.$

We show further that

(10)  $\operatorname{sgn} f_n\Big( \dfrac{(\nu - \frac12)\pi}{n} \Big) = (-1)^{\nu - 1} \qquad (\nu = 1, 2, \dots, p),$

which is (in the case of Legendre polynomials) equivalent to the lower
estimate in the Markoff-Stieltjes theorem.*

To this end we use the following inequality due to Fejér [4]:

(11)  $q_1 \sin t + q_2 \sin 2t + \dots + q_m \sin mt > \begin{cases} \tfrac12 q_m \sin mt, \\ -\tfrac12 q_m \sin (m + 1)t, \end{cases} \qquad 0 < t < \pi.$

Here $q_1, q_2, \dots, q_m$ is a positive, monotonically decreasing, and convex
sequence. Fejér gives only the first inequality; the second arises from the first
by considering the sequence $q_1, q_2, \dots, q_m, q_m$, which has, of course, the same
properties as $q_1, q_2, \dots, q_m$.

Let $n$ be even. On putting $2\theta = t$, $n/2 = m$, in the first inequality (11),
we see at once that

(12)  $\sin n\theta \sum_{k=0}^{[n/2]} \alpha_k\alpha_{n-k} \cos (n - 2k)\theta - \cos n\theta \sum_{k=0}^{[n/2]} \alpha_k\alpha_{n-k} \sin (n - 2k)\theta = \sum_{k=0}^{[n/2]} \alpha_k\alpha_{n-k} \sin 2k\theta$

(the last term $k = n/2$ again being multiplied by $\tfrac12$) is positive in the interval
$0 < \theta < \pi/2$. We have therefore

(13)  $f_n(\theta) \sin n\theta - g_n(\theta) \cos n\theta > 0, \qquad 0 < \theta < \pi/2,$

so that (10) is valid. If $n$ is odd, we put again $2\theta = t$, $(n - 1)/2 = m$, and
observe that the particular value $t = (2\nu - 1)\pi/n$ satisfies the equation
$mt + (m + 1)t = (2\nu - 1)\pi$, so that $\sin mt = \sin (m + 1)t$. This means that the two
expressions on the right side of (11) have opposite signs. The expression on the
left side is therefore positive and this gives again (10).

* We obtain immediately from (10) the existence of at least one zero in each of the intervals
$(\nu - \tfrac12)\pi/n < \theta < (\nu + \tfrac12)\pi/n$, $\nu = 1, 2, 3, \dots, p - 1$. An easy discussion shows then the existence of an
additional zero in $(p - \tfrac12)\pi/n < \theta < \pi/2$.


BIBLIOGRAPHY 


1. H. Bruns, Zur Theorie der Kugelfunktionen, Journal für Mathematik, vol. 90 (1881), pp. 322-328.

2. L. Fejér, Sur les fonctions bornées et intégrables, Comptes Rendus, Paris, vol. 131 (1900), pp. 984-987.

3. L. Fejér, Abschätzungen für die Legendreschen und verwandten Polynome, Mathematische
Zeitschrift, vol. 24 (1925), pp. 285-298.

4. L. Fejér, Einige Sätze, die sich auf das Vorzeichen einer ganzen rationalen Funktion beziehen;
nebst Anwendungen dieser Sätze auf die Abschnitte und Abschnittsmittelwerte von ebenen und räumlichen
harmonischen Entwicklungen und von beschränkten Potenzreihen, Monatshefte für Mathematik und
Physik, vol. 35 (1928), pp. 305-344.

5. E. Hille, Über die Nullstellen der Hermiteschen Polynome, Jahresbericht der Deutschen
Mathematiker-Vereinigung, vol. 44 (1934), pp. 162-165.

6. A. Markoff, Sur les racines de certaines équations (seconde note), Mathematische Annalen,
vol. 27 (1886), pp. 177-182.

7. G. Pólya, Über die Nullstellen gewisser ganzer Funktionen, Mathematische Zeitschrift, vol. 2
(1918), pp. 352-383.

8. T. J. Stieltjes, Sur les racines de l’équation $X_n = 0$, Acta Mathematica, vol. 9 (1886), pp. 385-400.

9. Ch. Sturm, Sur les équations différentielles linéaires du second ordre, Journal de Mathématiques,
vol. 1 (1836), pp. 106-186.

10. G. Szegö, Bemerkungen zu einer Arbeit von Herrn Fejér über die Legendreschen Polynome,
Mathematische Zeitschrift, vol. 25 (1926), pp. 172-187.

11. G. N. Watson, A Treatise on the Theory of Bessel Functions, Cambridge, University Press,
1922.


WASHINGTON UNIVERSITY,
ST. LOUIS, MO.

TRIGONOMETRISCHE REIHEN UND POTENZREIHEN 
MIT MEHRFACH MONOTONER 
KOEFFIZIENTENFOLGE* 


BY 
LEOPOLD FEJER 


EINLEITUNG 


1. In der vorliegenden Arbeit beschaftige ich mich mit Summen von der 
Form 


(1) > cm sin cmcosmd, >. 


d.h. mit trigonometrischen Reihen und Potenzreihen. Hier durchlaiuft m ent- 
weder die Gesamtheit der nichtnegativen ganzen Zahlen 0, 1, 2, 3,---, 
oder aber diese von einer bestimmten positiven ganzen Zahl n +1 angefangen, 
d.h. m=n+1, n+2, n+3,---. Im letzteren Falle nenne ich diese Reihen 
Restreihen, genauer trigonometrische Restreihen oder Potenzrestreihen. Eingeh- 
end werden auch die Fille untersucht, in denen m entweder nur alle nichtne- 
gativen geraden Zahlen, oder alle positiven ungeraden Zahlen durchlauft, 
bzw. diese nur von einer bestimmten angefangen. Schliesslich wird, wenn auch 
nur kurz, von Polynomen entsprechender Form die Rede sein, wo also die 
Koeffizienten c,, von einem gewissen Index an gleich Null sind. 

2. Die Grundvoraussetzung wird in der Regel die sein, dass die Koeffi- 
zientenfolge, welche die zugelassenen Indizes aufweist, und die ich immer 


mit 


bezeichnen kann, eine nichtnegative monotone Nullfolge sei,t und zwar ent- 
weder eine einfach monotone oder eine mehrfach monotone Nullfolge. Ein- 
fach monoton heisst eine aus nichtnegativen Gliedern bestehende Folge (2) 
dann, wenn auch ihre ersten Differenzen 


eine aus durchwegs nichtnegativen Gliedern bestehende unendliche Folge 
bilden; zweifach monoton, wenn ausserdem noch die unendliche Folge ihrer 


zweiten Differenzen 


* Presented to the Society, September 13, 1935; received by the editors February 1, 1935. 
+ Der Fall, in welchem alle c,=0 sind, sei indessen ein fiir allemal ausgeschlossen. 
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nichtnegativ ausfallt, u.s.w. Vollmonoton (E. Jacobsthal), oder auch total 
monoton (I. Schur) heisst eine Folge (2), wenn alle ihre Differenzenfolgen 
nichtnegativ ausfallen (d.h. A*c,=0 fiir »=0, 1, 2,3, ---;#=0,1,2,3,---). 

Auf dem Gebiete der mehrfach oder total monotonen Zahlenfolgen sind in 
der letzteren Zeit (1920-25) wichtige Begriffsbildungen und Resultate ent- 
standen, die hauptsichlich von E. Jacobsthal [16], K. Knopp [18, 19, 20, 
21], Th. Kaluza [17] und F. Hausdorff [11, 12] herriihren. Die héchst ele- 
mentaren hierher gehérigen Siatze, die in der vorliegender Arbeit gebraucht 
werden, sind, mit einigen Bemerkungen versehen, im §I zusammengestellt. 
Die Behandlung der Reihen (1) mit vollmonotonen Koeffizienten (2) méchte 
ich mir jedoch fiir eine spaitere Veréffentlichung vorbehalten. Bei vollmo- 
notonen c, kommt natiirlich der wichtige Satz von Hausdorff zur Anwendung, 
nach dem in diesem Falle die Koeffizienten c, gewisse Stieltjessche Momente 
endlichen Intervalles sind. 

3. My method of proof rests mainly on the positivity of the partial sums, or of the iterated partial sums (partial sums of higher order), of certain special trigonometric series. This is set out in §II. I remark that these positivity proofs, as a rule, rest in turn solely on the nonnegativity of the first-order partial sums of the series

(5)  $1 + 2\cos\theta + 2\cos 2\theta + \cdots + 2\cos n\theta + \cdots$

for every real $\theta$ or, what amounts to the same thing, on the nonnegativity of the finite sum

(6)  $\sin\theta + \sin 3\theta + \sin 5\theta + \cdots + \sin(2n+1)\theta$

in the interval $0 < \theta < \pi$ and for $n = 0, 1, 2, 3, \dots$. The nonnegativity of the sum (6) and the Abel transformation of a sum of the form $a_1b_1 + a_2b_2 + \cdots + a_nb_n$ are thus the two main supports of my proofs.
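The two ingredients just named can be tried out numerically. The following is only a minimal sketch: the helper names `abel_sum` and `odd_sines` are hypothetical and do not come from the paper, and the zero-based indexing is an assumption of the sketch.

```python
import math

def abel_sum(a, b):
    """Evaluate sum(a[i] * b[i]) through Abel's transformation (summation by parts)."""
    partial, running = [], 0.0
    for term in b:
        running += term
        partial.append(running)            # B_i = b_0 + ... + b_i
    total = a[-1] * partial[-1]            # boundary term a_{n-1} * B_{n-1}
    for i in range(len(a) - 1):
        total += (a[i] - a[i + 1]) * partial[i]
    return total

def odd_sines(n, theta):
    """Terms sin(theta), sin(3*theta), ..., sin((2n+1)*theta) of the sum (6)."""
    return [math.sin((2 * k + 1) * theta) for k in range(n + 1)]
```

With decreasing positive weights `a` and `b = odd_sines(n, theta)` for $0 < \theta < \pi$, every factor in the transformed sum is nonnegative, which is the pattern the proofs of this paper exploit.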

In Sections IV-VII, on the zeros of the sums $\sum c_m \sin m\theta$, $\sum c_m \cos m\theta$, a third element of proof nevertheless plays a very essential role: an original line of argument by means of which G. Szegő has recently enclosed the zeros of certain trigonometric polynomials between bounds.* I should like to say a few words about it here.


* Mr. Szegő kindly communicated his theorem on trigonometric polynomials to me, with proof, in a letter of April 21, 1934. Cf. his paper [26], which immediately precedes this one: Inequalities for the zeros of Legendre polynomials and related functions. Let it also be mentioned that I presented some of the contents of the present paper in three lectures at a continuing-education course for secondary-school teachers of mathematics at the University of Budapest in June and July 1934.
4. Szegő's line of argument likewise rests on the above inequality

(7)  $\sum_{\nu=0}^{n} \sin(2\nu+1)\theta \ge 0, \qquad 0 < \theta < \pi,$

or, what is the same thing, on

(8)  $\sum_{\nu=0}^{n} \sin(\nu+\tfrac12)\theta \ge 0, \qquad 0 < \theta < 2\pi.$

Before passing to Szegő's argument proper, I should like, by way of introduction, to draw from inequality (8) the very simplest conclusion of a related character. Let $c_0, c_1, c_2, \dots, c_n$ be positive numbers that decrease monotonically, i.e. $c_0 > c_1 > c_2 > \cdots > c_n > 0$. Applying the ordinary Abel transformation, one then concludes immediately from (8) that

(9)  $\sum_{\nu=0}^{n} c_\nu \sin(\nu+\tfrac12)\theta > 0, \qquad 0 < \theta < 2\pi,$

or

(10)  $\Bigl(\sum_{\nu=0}^{n} c_\nu \sin\nu\theta\Bigr)\cos\frac{\theta}{2} + \Bigl(\sum_{\nu=0}^{n} c_\nu \cos\nu\theta\Bigr)\sin\frac{\theta}{2} > 0, \qquad 0 < \theta < 2\pi.$


From inequality (10) it evidently follows that the polynomials $\sum_{\nu=0}^{n} c_\nu\cos\nu\theta$ and $\sum_{\nu=0}^{n} c_\nu\sin\nu\theta$ cannot vanish simultaneously for $0<\theta<2\pi$. Since this is also the case for $\theta = 0$, it is thus proved that the two trigonometric polynomials cannot vanish simultaneously for any real $\theta$. Finally, since along with $c_0, \dots, c_n$ the sequence $c_0, c_1r, \dots, c_nr^n$, $0 < r \le 1$, is also monotonically decreasing, we have the result: the polynomial

(11)  $c_0 + c_1z + c_2z^2 + \cdots + c_nz^n$

of the complex variable $z$ has no zero for $|z| \le 1$ if $c_0 > c_1 > \cdots > c_n > 0$. This is the well-known Eneström-Kakeya theorem. The usual proof of this theorem [by multiplication by $(1-z)$] is of course also extremely simple; I only wanted to show here that it is also an immediate consequence of inequality (7), i.e. a consequence of the fact that the sum of the sines of the first $n$ odd multiples of $\theta$ always remains nonnegative in the interval $0<\theta<\pi$, just like the sine of the first multiple, i.e. $\sin\theta$ itself.
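Inequality (9) can be checked numerically. The sketch below is an illustration only; the function names are hypothetical. The explicit lower bound $(c_0 - c_1)\sin(\theta/2)$ is not stated in the text: it is what the Abel transformation gives when only its first term is kept, since the partial sums of (8) equal $\sin^2((\nu+1)\theta/2)/\sin(\theta/2)$.

```python
import math

def weighted_half_angle_sum(c, theta):
    """Left-hand side of (9): sum of c[k] * sin((k + 1/2) * theta)."""
    return sum(ck * math.sin((k + 0.5) * theta) for k, ck in enumerate(c))

def first_term_bound(c, theta):
    """Assumed lower bound (c_0 - c_1) * sin(theta / 2), from the first Abel term."""
    return (c[0] - c[1]) * math.sin(theta / 2)
```

For strictly decreasing positive coefficients the bound is positive on $0<\theta<2\pi$, so the cosine and sine polynomials of (10) cannot vanish together there.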

Szegő uses this fact not to prove the nonexistence of a common zero of certain pairs of conjugate trigonometric polynomials but, on the contrary, to prove the existence of real zeros of certain trigonometric polynomials

(12)  $T(\theta) = \sum_{\nu=0}^{n} c_\nu \cos\nu\theta,$

and he encloses these zeros between bounds by proceeding as follows. In (12) introduce $n-\nu$ instead of $\nu$ as the summation index. Then

(13)  $$\begin{aligned} T(\theta) &= \sum_{\nu=0}^{n} c_{n-\nu}\cos(n-\nu)\theta = \sum_{\nu=0}^{n} c_{n-\nu}\cos\bigl((n+\tfrac12)-(\nu+\tfrac12)\bigr)\theta \\ &= \Bigl(\sum_{\nu=0}^{n} c_{n-\nu}\cos(\nu+\tfrac12)\theta\Bigr)\cos(n+\tfrac12)\theta + \Bigl(\sum_{\nu=0}^{n} c_{n-\nu}\sin(\nu+\tfrac12)\theta\Bigr)\sin(n+\tfrac12)\theta, \end{aligned}$$

so that, if the roots of the equation $\cos(n+\tfrac12)\theta = 0$ are denoted by

(14)  $\theta_k = (k+\tfrac12)\dfrac{\pi}{n+\tfrac12} \qquad (k = 0, 1, 2, 3, \dots, 2n),$

equation (13) yields at one stroke

(15)  $T(\theta_k) = B(\theta_k)(-1)^k \qquad (k = 0, 1, 2, \dots, 2n).$

Here

(16)  $B(\theta) = \sum_{\nu=0}^{n} c_{n-\nu}\sin(\nu+\tfrac12)\theta.$


If now $0 < c_0 < c_1 < \cdots < c_n$, then, as we have just remarked, $B(\theta) > 0$ for $0 < \theta < 2\pi$, so that (15) gives

(17)  $\operatorname{sgn} T(\theta_k) = (-1)^k \qquad (k = 0, 1, 2, \dots, 2n).$

From this there follows at once, according to Szegő, the following remarkable

Theorem: If in the cosine polynomial $c_0 + c_1\cos\theta + \cdots + c_n\cos n\theta$

(18)  $0 < c_0 < c_1 < \cdots < c_n,$

then it has, in the interval $0 < \theta < 2\pi$, $2n$ mutually distinct zeros $t_1, t_2, t_3, \dots, t_{2n}$, and

(19)  $(k-\tfrac12)\dfrac{\pi}{n+\tfrac12} < t_k < (k+\tfrac12)\dfrac{\pi}{n+\tfrac12} \qquad (k = 1, 2, 3, \dots, 2n).$

A similar theorem holds for $c_1\sin\theta + c_2\sin 2\theta + \cdots + c_n\sin n\theta$.
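The alternation of signs in (17), which is what locates one zero in each interval of (19), can be observed directly. This is a hedged sketch: the function names and the sample coefficients used in the usage note are made up for illustration.

```python
import math

def cosine_poly(c, theta):
    """T(theta) = sum of c[k] * cos(k * theta)."""
    return sum(ck * math.cos(k * theta) for k, ck in enumerate(c))

def szego_nodes(n):
    """The 2n + 1 roots theta_k = (k + 1/2) * pi / (n + 1/2) of cos((n + 1/2) * theta)."""
    return [(k + 0.5) * math.pi / (n + 0.5) for k in range(2 * n + 1)]

def signs_at_nodes(c):
    """Signs of T at the nodes; for increasing positive c they should alternate."""
    n = len(c) - 1
    return [1 if cosine_poly(c, t) > 0 else -1 for t in szego_nodes(n)]
```

For instance, `signs_at_nodes([1, 2, 3, 5])` has seven entries with six sign changes, one for each zero $t_1, \dots, t_6$ of this cubic cosine polynomial.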
That a cosine polynomial (or sine polynomial) with positive, monotonically increasing coefficients has only real, mutually distinct zeros was already found by G. Pólya [22], who derived it from the Eneström-Kakeya theorem. We see that Szegő, by using $\sin\theta + \sin 3\theta + \cdots + \sin(2n+1)\theta \ge 0$, was able to advance to a bound (19) for all the zeros that is independent of the coefficients of the cosine polynomial.*

It is Szegő's line of argument, in the form set out above, that I use throughout the present paper whenever I investigate the zeros of trigonometric remainder series.
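5. In this introduction I should still like to single out some results of the present paper.

Consider the series

(20)  $f(\theta) = c_0 + c_1\cos\theta + c_2\cos 2\theta + c_3\cos 3\theta + \cdots + c_n\cos n\theta + \cdots,$
(21)  $g(\theta) = c_1\sin\theta + c_2\sin 2\theta + c_3\sin 3\theta + \cdots,$
(22)  $\varphi(\theta) = c_1\cos\theta + c_2\cos 3\theta + c_3\cos 5\theta + \cdots,$
(23)  $\psi(\theta) = c_1\sin\theta + c_2\sin 3\theta + c_3\sin 5\theta + \cdots,$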


* From the Eneström-Kakeya theorem Pólya derives, with the help of the argument principle of function theory, the more general theorem: the equation
$c_0 + c_1\cos\theta + \cdots + c_n\cos n\theta = 0$
has $2n$ real roots, distinct mod $2\pi$, provided the roots of the algebraic equation
$c_0 + c_1z + \cdots + c_nz^n = 0$
lie in the interior of the unit circle $|z| < 1$ (valid also for the sine polynomial). About the bounds for (the distribution of) the roots of the cosine polynomial on the unit circle, all of which are real, Pólya says nothing, however. By a slight modification of Pólya's original proof one can nevertheless, as I have noticed, establish in this more general case too, concerning the distribution of the roots in Szegő's sense, at least as much as is contained in the following theorem: If the circumference of the unit circle is divided into $n$ equal parts, then on every partial arc there lies at least one zero of the cosine polynomial $c_0 + c_1\cos\theta + \cdots + c_n\cos n\theta$, provided the roots of the algebraic equation $c_0 + c_1z + \cdots + c_nz^n = 0$ all lie in the interior of the unit circle. (The same is valid for the sine polynomial $c_1\sin\theta + c_2\sin 2\theta + \cdots + c_n\sin n\theta$.) The limiting case
$(1+z)^n\big|_{z=e^{i\theta}} = 2^n\cos^n\frac{\theta}{2}\cos\frac{n\theta}{2} + i\,2^n\cos^n\frac{\theta}{2}\sin\frac{n\theta}{2}$
shows that this theorem cannot be improved, insofar as an arc of the unit circle shorter than $2\pi/n$ need not contain any zero of the cosine or sine polynomial. It is further true that on every arc of length $\pi/n$ of the unit circle either the cosine polynomial or the conjugate sine polynomial has a zero. In the proof of these theorems a role is played by the simple geometric fact that an arc of the unit circle of length $l$ ($l \le \pi$) is seen from no interior point of the unit disc under an angle smaller than $l/2$. I intend, moreover, to return to the theorems of Pólya and Szegő on trigonometric polynomials in another publication.
(24)  $F(z) = c_0 + c_1z + c_2z^2 + \cdots,$
(25)  $G(z) = c_1z + c_2z^3 + c_3z^5 + \cdots,$

where the coefficient sequences are always nonnegative null sequences that never vanish identically.

If the nonnegative null sequence $c_0, c_1, c_2, \dots$ is twofold monotone (it suffices to take $2c_0$ instead of $c_0$), then $f(\theta)$ is nonnegative for every $\theta$ (§III). If $c_1, c_2, c_3, \dots$ is fourfold monotone, then $f(\theta)$ is monotonically decreasing in the interval $0 < \theta < \pi$. If $c_1, c_2, c_3, \dots$ is only threefold monotone, then $f(\theta)$ need not be monotonically decreasing in the interval $(0, \pi)$ (§III). If $c_1, c_2, c_3, \dots$ is twofold monotone, then $g(\theta)$ is positive for $0 < \theta < \pi$ (§III). If $c_1, c_2, c_3, \dots$ is fourfold monotone, then $g(\theta)$ is monotonically decreasing in the interval $\pi/2 < \theta < \pi$. If $c_1, c_2, c_3, \dots$ is threefold monotone, then $\varphi(\theta)$ is positive in $(0, \pi/2)$, negative in $(\pi/2, \pi)$, and so on; it thus has the sign of $\cos\theta$, i.e. $\operatorname{sgn}\varphi(\theta) = \operatorname{sgn}\cos\theta$ (§III). Moreover, $\varphi(\theta)$ is also monotonically decreasing in $(0, \pi)$. If $c_1, c_2, c_3, \dots$ is simply monotone, then $\operatorname{sgn}\psi(\theta) = \operatorname{sgn}\sin\theta$ (§III). If $c_0, c_1, c_2, \dots$ is threefold monotone, then for the remainder of the power series of $F(z)$ the inequality $|c_{n+1}z^{n+1} + c_{n+2}z^{n+2} + \cdots| < |F(z)|$ holds for $|z| \le 1$, $z \ne 1$, $n = 0, 1, 2, 3, \dots$, from which $|s_n(z)| = |c_0 + c_1z + \cdots + c_nz^n| \le 2|F(z)|$ follows for $|z| \le 1$, $n = 0, 1, 2, 3, \dots$ (§VIII). If $c_1, c_2, c_3, \dots$ is fourfold monotone, then the function $F(z)$ is univalent (schlicht) for $|z| < 1$ (§IX). If $c_1, c_2, c_3, \dots$ is fourfold monotone, then every remainder of $G(z)$ is in absolute value $< |G(z)|$ for $|z| \le 1$, $z \ne 1$; if $c_1, c_2, c_3, \dots$ is threefold monotone, then $G(z)$ is univalent for $|z| < 1$.

Of my theorems on bounds for the zeros of remainder series, only the following will be quoted here:

If the coefficient sequence $c_0, c_1, c_2, \dots$ of the trigonometric series

(26)  $h(\theta) = \sum_{\nu=0}^{\infty} c_\nu\sin(n+2\nu+1)\theta = c_0\sin(n+1)\theta + c_1\sin(n+3)\theta + \cdots$

is a threefold monotone null sequence, then the function $h(\theta)$, continuous for $0 < \theta < \pi$, has at least $n$ zeros $t_1, \dots, t_n$ in the interval $0 < \theta < \pi$. For the zeros lying in the interval $0 < \theta \le \pi/2$ the bounds

(27)  $(k-\tfrac12)\dfrac{\pi}{n} < t_k < k\,\dfrac{\pi}{n+1} \qquad \Bigl(k = 1, 2, 3, \dots, n' = \Bigl[\dfrac{n+1}{2}\Bigr]\Bigr)$

hold. [If $n$ is odd, the last of these inequalities must be replaced by

(28)  $(n'-\tfrac12)\dfrac{\pi}{n} = t_{n'},$

that is, $t_{n'} = \pi/2$.] For the remaining zeros $t_k$, which lie symmetrically to the above with respect to $\pi/2$, the bounds symmetric to the above with respect to $\pi/2$ hold (§IV).

A. Hurwitz remarks in his paper [15] that from the mere fact that the infinite sine series of the $n$th Legendre polynomial $P_n(\cos\theta)$ [Heine's sine series of $P_n(\cos\theta)$] has the form

(29)  $P_n(\cos\theta) = \sum_{\nu=0}^{\infty} c_\nu\sin(n+2\nu+1)\theta = c_0\sin(n+1)\theta + c_1\sin(n+3)\theta + \cdots,$

i.e. that in it the coefficients of $\sin\theta, \sin 2\theta, \dots, \sin n\theta$ are equal to zero, the existence of $n$ distinct zeros of $P_n(\cos\theta)$ in the interval $(0, \pi)$ follows without further ado. He obtains this by generalizing a theorem of Sturm on trigonometric polynomials that begin with $c_0\sin(n+1)\theta + \cdots$ to series of the same form. I have now found that the coefficients $c_0, c_1, c_2, \dots$ in Heine's sine series form a threefold monotone (indeed even a completely monotone) sequence. On the basis of my theorem above, and by mere inspection of the coefficient sequence of Heine's sine series, the bounds (27) thus follow for the zeros of $P_n(\cos\theta)$, i.e. the so-called Markoff-Stieltjes bounds for the zeros.

6. It may perhaps be of interest if I show, still in this introduction, how briefly and naturally it can be proved that the coefficient sequence of Heine's sine series of $P_n(\cos\theta)$ (a very remarkable expansion) is completely monotone.

If

(30)  $(1-r)^{-1/2} = \sum_{\nu=0}^{\infty} \dfrac{1\cdot 3\cdot 5\cdots(2\nu-1)}{2\cdot 4\cdot 6\cdots 2\nu}\, r^\nu = \sum_{\nu=0}^{\infty} a_\nu r^\nu,$

then, as is well known,

(31)  $P_n(\cos\theta) = \sum_{k=0}^{n} a_k a_{n-k}\, e^{i(n-2k)\theta} = \sum_{k=0}^{n} a_k a_{n-k}\, e^{-i(n-2k)\theta},$

whence

(32)  $P_n(\cos\theta) = \sum_{k=0}^{n} a_k a_{n-k}\cos(n-2k)\theta,$

so that $P_n(\cos\theta)$ is a cosine polynomial of order $n$. Let us now look for the sine series of $P_n(\cos\theta)$:

(33)  $b_1\sin\theta + b_2\sin 2\theta + \cdots + b_n\sin n\theta + b_{n+1}\sin(n+1)\theta + \cdots.$

I assert first of all that
(34)  $b_1 = b_2 = \cdots = b_n = 0.$

Indeed, by Fourier's formula,

(35)  $b_\nu = \dfrac{2}{\pi}\int_0^\pi P_n(\cos\theta)\sin\nu\theta\, d\theta = \dfrac{2}{\pi}\int_0^\pi P_n(\cos\theta)\,\dfrac{\sin\nu\theta}{\sin\theta}\,\sin\theta\, d\theta = \dfrac{2}{\pi}\int_{-1}^{+1} P_n(x)\,U_{\nu-1}(x)\, dx,$

where $U_{\nu-1}(x)$ denotes a polynomial in $x$ of degree $\nu-1$, namely $\sin\nu\theta/\sin\theta$ as a function of $\cos\theta = x$. By the orthogonality property of $P_n(x)$, (34) therefore does hold. Since moreover $P_n(\cos(\pi-\theta)) = P_n(\cos\theta)$ or $= -P_n(\cos\theta)$ according as $n$ is even or odd, we obtain in both cases for the sine series of $P_n(\cos\theta)$ the form

(36)  $P_n(\cos\theta) = \sum_{\nu=0}^{\infty} c_\nu^{*}\sin(n+2\nu+1)\theta = c_0^{*}\sin(n+1)\theta + c_1^{*}\sin(n+3)\theta + \cdots.$

To determine the coefficients $c_\nu^{*}$ here, I have only, in view of (32), to expand $\cos(n-2k)\theta$ in a sine series. Since this function has the same symmetry property as $P_n(\cos\theta)$, its sine series has the form

(37)  $\cos(n-2k)\theta = \cdots + \sum_{\nu=0}^{\infty}\beta_\nu\sin(n+2\nu+1)\theta = \cdots + \beta_0\sin(n+1)\theta + \beta_1\sin(n+3)\theta + \cdots;$

the coefficients of $\sin\theta, \sin 2\theta, \dots, \sin n\theta$ are of no interest to us here since, as already shown, they drop out anyway after the addition required in (32). Now, in view of (37), Fourier's formula gives

(38)  $\beta_\nu = \dfrac{2}{\pi}\int_0^\pi \cos((n-2k)\theta)\,\sin((n+2\nu+1)\theta)\, d\theta = \dfrac{2}{\pi}\Bigl(\dfrac{1}{2(n-k)+2\nu+1} + \dfrac{1}{2k+2\nu+1}\Bigr) \qquad (\nu = 0, 1, 2, \dots).$

Hence, in view of (38), (37), (36) and (32),

(39)  $c_\nu^{*} = \dfrac{2}{\pi}\sum_{k=0}^{n} a_k a_{n-k}\Bigl(\dfrac{1}{2(n-k)+2\nu+1} + \dfrac{1}{2k+2\nu+1}\Bigr).$

But one has
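(40)  $\sum_{k=0}^{n} a_k a_{n-k}\,\dfrac{1}{2(n-k)+2\nu+1} = \sum_{k=0}^{n} a_k a_{n-k}\,\dfrac{1}{2k+2\nu+1},$

since one of these sums arises from the other by replacing $k$ by $n-k$. Thus finally†

(41)  $c_\nu^{*} = \dfrac{4}{\pi}\sum_{k=0}^{n} a_k a_{n-k}\,\dfrac{1}{2k+2\nu+1} = \dfrac{4}{\pi}\Bigl(\dfrac{a_0a_n}{2\nu+1} + \dfrac{a_1a_{n-1}}{2\nu+3} + \dfrac{a_2a_{n-2}}{2\nu+5} + \cdots + \dfrac{a_na_0}{2\nu+2n+1}\Bigr),$

where $a_k$, I repeat, denotes the value

$a_k = \dfrac{1\cdot 3\cdots(2k-1)}{2\cdot 4\cdots 2k}.$

From this form (41) of Heine's sine coefficients $c_\nu^{*}$ it is evident, however, that $c_0^{*}, c_1^{*}, \dots$ is a positive completely monotone null sequence, since $1/(2k+2\nu+1)$, $\nu = 0, 1, 2, \dots$, is such a sequence for each of the $(n+1)$ values $k = 0, 1, \dots, n$, and $a_0, a_1, \dots, a_n$ denote positive numbers.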
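Formula (41) lends itself to an exact check with rational arithmetic. The sketch below computes $(\pi/4)c_\nu^{*}$ from the sum in (41), compares it with the product form quoted in the footnote to (41), and tests positivity of the differences up to a chosen order. The function names are hypothetical, and a finite test can of course only illustrate complete monotonicity, not prove it.

```python
from fractions import Fraction
from math import comb

def a(k):
    """a_k = 1*3*...*(2k-1) / (2*4*...*2k), with a_0 = 1."""
    out = Fraction(1)
    for j in range(1, k + 1):
        out *= Fraction(2 * j - 1, 2 * j)
    return out

def heine_sum(n, nu):
    """(pi/4) * c*_nu as the sum in (41)."""
    return sum(a(k) * a(n - k) / (2 * k + 2 * nu + 1) for k in range(n + 1))

def heine_product(n, nu):
    """(pi/4) * c*_nu as the quotient of products given in the footnote to (41)."""
    out = Fraction(1, 2 * nu + 1)
    for j in range(1, n + 1):
        out *= Fraction(2 * nu + 2 * j, 2 * nu + 2 * j + 1)
    return out

def differences_positive(seq, max_order):
    """True if Delta^k seq_m > 0 for all k <= max_order and all usable m."""
    return all(
        sum((-1) ** j * comb(k, j) * seq[m + j] for j in range(k + 1)) > 0
        for k in range(max_order + 1)
        for m in range(len(seq) - max_order)
    )
```

Using `Fraction` avoids rounding, so the agreement of the two forms is exact for every pair $(n, \nu)$ that is tried.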

I. ON SUMS AND DIFFERENCES OF HIGHER ORDER

1. If

(1)  $u_0 + u_1 + u_2 + \cdots + u_n + \cdots$

is an arbitrary infinite series, the sequences

(2)  $s_n^{(0)} = \sum_{\nu=0}^{n} u_\nu, \quad s_n^{(1)} = \sum_{\nu=0}^{n} s_\nu^{(0)}, \quad s_n^{(2)} = \sum_{\nu=0}^{n} s_\nu^{(1)}, \dots \qquad (n = 0, 1, 2, 3, \dots)$

are called the partial sums of order 0, 1, 2, ... of the infinite series (1). If the terms of the sequence $s_n^{(k)}$ of partial sums of order $k$ of an infinite series (1) are nonnegative, I say that the series is nonnegative of order $k$ (briefly, positive of order $k$). If a series is positive of order $k$, it is of course also positive of order $k'$ whenever $k' > k$. The smallest order $k$ for which (1) is positive is called its true order. For a given series this can also be $+\infty$.

† From this remarkable form of the $c_\nu^{*}$ there follows, among other things, without further ado the form (6), (7) in §VI, since e.g. a partial-fraction decomposition yields the equation

$(\pi/4)\,c_\nu^{*} = \sum_{k=0}^{n}\dfrac{a_k a_{n-k}}{2k+2\nu+1} = \dfrac{(2\nu+2)(2\nu+4)\cdots(2\nu+2n)}{(2\nu+1)(2\nu+3)\cdots(2\nu+2n-1)(2\nu+2n+1)}.$
If the radius of convergence of the power series

(3)  $F(r) = \sum_{\nu=0}^{\infty} u_\nu r^\nu$

is positive, the function $F(r)$ is called the generating function of the sequence of terms $u_0, u_1, u_2, \dots$; then

(4)  $(1-r)^{-k-1}F(r) = \sum_{\nu=0}^{\infty} s_\nu^{(k)} r^\nu$

represents the generating function of the sequence of the $k$th partial sums of the series (1). Hence

(5)  $s_n^{(k)} = \dbinom{n+k}{k}u_0 + \dbinom{n+k-1}{k}u_1 + \dbinom{n+k-2}{k}u_2 + \cdots + \dbinom{k}{k}u_n \qquad (n = 0, 1, 2, 3, \dots;\ k = 0, 1, 2, 3, \dots).$
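The definition (2) and the binomial formula (5) can be compared on any finite list of terms. This is a small sketch with hypothetical function names; order 0 means the ordinary partial sums, as in (2).

```python
from math import comb

def partial_sums(u, order):
    """Partial sums s^(order) of the terms u: cumulative summation applied order + 1 times."""
    s = list(u)
    for _ in range(order + 1):
        running, out = 0, []
        for x in s:
            running += x
            out.append(running)
        s = out
    return s

def partial_sum_formula(u, order, n):
    """s_n^(order) from (5): the term u_j carries the weight C(n - j + order, order)."""
    return sum(comb(n - j + order, order) * u[j] for j in range(n + 1))
```

The weights `comb(n - j + order, order)` are the coefficients of $(1-r)^{-(\text{order}+1)}$, which is how (5) follows from (4).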


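2. Let

(6)  $v_0, v_1, v_2, \dots, v_n, \dots$

be an arbitrary infinite sequence. Further let

(7)  $$\begin{aligned} \Delta^0 v_n &= v_n, \\ \Delta^1 v_n &= \Delta v_n = v_n - v_{n+1}, \\ \Delta^2 v_n &= \Delta(\Delta v_n) = v_n - 2v_{n+1} + v_{n+2}, \\ &\ \ \vdots \\ \Delta^k v_n &= \Delta(\Delta^{k-1} v_n) = \binom{k}{0}v_n - \binom{k}{1}v_{n+1} + \cdots + (-1)^k\binom{k}{k}v_{n+k} \qquad (n = 0, 1, 2, 3, \dots). \end{aligned}$$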
Here $\Delta^k v_n$ is called the difference of order $k$ with index $n$ of the sequence (6). If for a sequence (6) all differences up to and including order $k$ are nonnegative, i.e. if the sequences $\Delta^0 v_n, \Delta^1 v_n, \Delta^2 v_n, \dots, \Delta^k v_n$ consist of nonnegative numbers only, the sequence (6) is called $k$-fold monotone. If this holds for every order $k$, (6) is called completely monotone, or also totally monotone. If the sequence (6) is $k$-fold monotone or completely monotone, so is each of its remainder sequences $v_{n+1}, v_{n+2}, v_{n+3}, \dots$. If both sequences $v_n$, $w_n$ are $k$-fold monotone, the same holds for the sequences $v_n + w_n$, $v_nw_n$. The latter follows from the well-known important formula,* valid for every order,

(8)  $\Delta^k(v_nw_n) = \sum_{\nu=0}^{k}\dbinom{k}{\nu}(\Delta^\nu v_n)(\Delta^{k-\nu}w_{n+\nu}).$

Thus

(9)  $$\begin{aligned} \Delta^0(v_nw_n) &= v_nw_n, \\ \Delta^1(v_nw_n) &= v_n\Delta w_n + (\Delta v_n)w_{n+1}, \\ \Delta^2(v_nw_n) &= v_n\Delta^2 w_n + 2(\Delta v_n)\Delta w_{n+1} + (\Delta^2 v_n)w_{n+2}, \\ \Delta^3(v_nw_n) &= v_n\Delta^3 w_n + 3(\Delta v_n)\Delta^2 w_{n+1} + 3(\Delta^2 v_n)\Delta w_{n+2} + (\Delta^3 v_n)w_{n+3}. \end{aligned}$$

3. If

(10)  $F(r) = \sum_{\nu=0}^{\infty} v_\nu r^\nu$

is the generating function of the sequence $v_0, v_1, v_2, \dots$, and if

(11)  $(1-r)^kF(r) = t_0 + t_1r + t_2r^2 + \cdots,$

then

(12)  $\Delta^k v_n = (-1)^k t_{n+k},$

i.e. $(-1)^k\Delta^k v_n$ agrees with the $(n+k)$th coefficient in the expansion of $(1-r)^kF(r)$ in powers of $r$ $(n, k = 0, 1, 2, 3, \dots)$.

Example:

(13)  $v_n = \dfrac{\rho(\rho+1)(\rho+2)\cdots(\rho+n-1)}{1\cdot 2\cdot 3\cdots n}.$

Here $F(r) = (1-r)^{-\rho}$, hence $(1-r)^kF(r) = (1-r)^{-(\rho-k)}$. Consequently

$\Delta^k v_n = (-1)^k t_{n+k} = \dfrac{(1-\rho)(2-\rho)\cdots(k-\rho)}{1\cdot 2\cdots k}\cdot\dfrac{\rho(\rho+1)\cdots(\rho+n-1)}{(k+1)(k+2)\cdots(k+n)}.$

If therefore $0 < \rho < 1$, then for the sequence (13) we have $\Delta^k v_n > 0$ for $n, k = 0, 1, 2, 3, \dots$, i.e. it is completely monotone.
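The example (13) can be verified exactly for a rational $\rho$. The sketch below evaluates $\Delta^k v_n$ from the definition (7) and from the closed product form derived above; all function names are hypothetical.

```python
from fractions import Fraction
from math import comb

def difference(v, k, n):
    """Delta^k v_n = sum over j of (-1)^j * C(k, j) * v[n + j], as in (7)."""
    return sum((-1) ** j * comb(k, j) * v[n + j] for j in range(k + 1))

def v_example(rho, n):
    """v_n = rho (rho+1) ... (rho+n-1) / n!, the sequence (13); v_0 = 1."""
    out = Fraction(1)
    for j in range(n):
        out *= (rho + j) / (j + 1)
    return out

def closed_difference(rho, k, n):
    """Product form of Delta^k v_n for the sequence (13)."""
    out = Fraction(1)
    for j in range(1, k + 1):
        out *= (j - rho) / j
    for j in range(n):
        out *= (rho + j) / (k + 1 + j)
    return out
```

With `rho = Fraction(1, 3)` every factor of `closed_difference` is positive, which is the complete monotonicity claimed for $0 < \rho < 1$.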
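4. The following theorem is also quite useful.

Let

(14)  $Q = (1-q_1)(1-q_2)\cdots(1-q_n)\cdots$

be an infinite product in which $q_1, q_2, \dots, q_n, \dots$ denote nonnegative numbers that do not exceed 1. If then the sequence $q_1, q_2, \dots, q_n, \dots$ is $k$-fold monotone, the infinite sequence of partial products

(15)  $Q_1, Q_2, \dots, Q_n, \dots,$

where

(16)  $Q_n = (1-q_1)(1-q_2)\cdots(1-q_n),$

is at least $(k+1)$-fold monotone.

Certainly $\Delta^0 Q_n = Q_n \ge 0$. Further,

(17)  $\Delta^1 Q_n = Q_n - Q_{n+1} = Q_n\bigl(1 - (1 - q_{n+1})\bigr),$

i.e.

(18)  $\Delta^1 Q_n = Q_n q_{n+1}.$

From this equation (18) we obtain by successive differencing, with the help of formula (8), the following chain of equations:

$$\begin{aligned} \Delta^2 Q_n &= Q_n\Delta q_{n+1} + (\Delta Q_n)q_{n+2}, \\ \Delta^3 Q_n &= Q_n\Delta^2 q_{n+1} + 2(\Delta Q_n)\Delta q_{n+2} + (\Delta^2 Q_n)q_{n+3}, \\ &\ \ \vdots \\ \Delta^{k+1} Q_n &= Q_n\Delta^k q_{n+1} + \binom{k}{1}(\Delta Q_n)\Delta^{k-1} q_{n+2} + \cdots + (\Delta^k Q_n)q_{n+k+1}. \end{aligned}$$

From this chain of equations the asserted theorem follows immediately.

* This formula gives $\Delta^k(v_nw_n) \ge v_n\Delta^k w_n$, and of course also $\ge w_{n+k}\Delta^k v_n$, if both sequences $v_n$, $w_n$ are $k$-fold monotone.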


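II. ON THE ORDER OF POSITIVITY OF THE SUMS OF VARIOUS ORDERS OF SOME ELEMENTARY TRIGONOMETRIC SERIES

1. If one sets $z = re^{i\theta}$ in the series

$\dfrac{1+z}{1-z} = 1 + 2\sum_{\nu=1}^{\infty} z^\nu, \qquad \dfrac{z}{1-z^2} = \sum_{\nu=0}^{\infty} z^{2\nu+1},$

then, separating the real part from the imaginary part, one obtains (up to trivial factors and a renaming of $r$) four series which, together with their derivatives with respect to $\theta$, read as follows:

(1)  $F_1(r) = \dfrac{1-r^2}{1-2r\cos\theta+r^2} = 1 + 2\sum_{\nu=1}^{\infty} r^\nu\cos\nu\theta,$

(2)  $F_2(r) = \dfrac{\sin\theta}{1-2r\cos\theta+r^2} = \sum_{\nu=0}^{\infty} r^\nu\sin(\nu+1)\theta,$

(3)  $F_3(r) = \dfrac{(1-r)\cos\theta}{1-2r\cos 2\theta+r^2} = \sum_{\nu=0}^{\infty} r^\nu\cos(2\nu+1)\theta,$

(4)  $F_4(r) = \dfrac{(1+r)\sin\theta}{1-2r\cos 2\theta+r^2} = \sum_{\nu=0}^{\infty} r^\nu\sin(2\nu+1)\theta,$

(5)  $F_5(r) = \dfrac{\cos\theta - 2r + (\cos\theta)r^2}{(1-2r\cos\theta+r^2)^2} = \sum_{\nu=0}^{\infty} r^\nu(\nu+1)\cos(\nu+1)\theta,$

(6)  $F_6(r) = \dfrac{(1-r^2)\sin\theta}{(1-2r\cos\theta+r^2)^2} = \sum_{\nu=0}^{\infty} r^\nu(\nu+1)\sin(\nu+1)\theta,$

(7)  $F_7(r) = \dfrac{(1+r)(1-2(3-2\cos^2\theta)r+r^2)\cos\theta}{(1-2r\cos 2\theta+r^2)^2} = \sum_{\nu=0}^{\infty} r^\nu(2\nu+1)\cos(2\nu+1)\theta,$

(8)  $F_8(r) = \dfrac{(1-r)(1+2(1+2\cos^2\theta)r+r^2)\sin\theta}{(1-2r\cos 2\theta+r^2)^2} = \sum_{\nu=0}^{\infty} r^\nu(2\nu+1)\sin(2\nu+1)\theta.$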

2. From our point of view the "least favorable" of these series is $F_7(r)$. For its numerator

$f_7(r) = (1+r)(1 - 2(3 - 2\cos^2\theta)r + r^2)\cos\theta$

we have $f_7(0) = \cos\theta$, $f_7(1) = -8\sin^2\theta\cos\theta$, so that $F_7(r)$ takes both positive and negative values for $0 < r < 1$, whatever the value of $\theta$; only the values $\theta = 0, \pi/2, \pi, 3\pi/2$ are exceptions. It follows that $(1-r)^{-k}F_7(r)$ must have both positive and negative coefficients if $\theta$ is arbitrary, only different from $0, \pi/2, \pi, 3\pi/2$, and this for every nonnegative integer $k$. The series

(9)  $\cos\theta + 3\cos 3\theta + 5\cos 5\theta + \cdots + (2n-1)\cos(2n-1)\theta + \cdots$

is therefore nonnegative or nonpositive of no order, however high (except for $\theta = 0, \pi/2, \pi, 3\pi/2$).

Since, putting $f_5(r) = \cos\theta - 2r + (\cos\theta)r^2$, we have $f_5(0) = \cos\theta$, $f_5(1) = -4\sin^2(\theta/2)$, the series

(10)  $\cos\theta + 2\cos 2\theta + 3\cos 3\theta + \cdots$

is, for $-\pi/2 < \theta < \pi/2$, neither nonpositive nor nonnegative of order $k$, however large the order $k$ may be (except for $\theta = 0$). If, however, $\pi/2 < \theta < 3\pi/2$, the series (10) is, as will be shown below, negative of the third order.

3. As for the remaining six trigonometric series, they are all nonnegative or nonpositive of a certain order in suitable intervals. [Cf. in particular Fejér 5, 7, 9, 10.]

First of all, the series

(11)  $1 + 2\cos\theta + 2\cos 2\theta + \cdots + 2\cos n\theta + \cdots$

is nonnegative of the first order for every real value of $\theta$. Indeed, here

(12)  $s_n^{(1)}(\theta) = (n+1) + n\cdot 2\cos\theta + (n-1)\cdot 2\cos 2\theta + \cdots + 1\cdot 2\cos n\theta.$

This sum $s_n^{(1)}(\theta)$ is equal to $(\sin(\theta/2))^{-2}(\sin[(n+1)\theta/2])^2$, hence nonnegative for every* $\theta$. Another representation, due to Toeplitz,

(13)  $(n+1) + 2n\cos\theta + 2(n-1)\cos 2\theta + \cdots + 2\cdot 1\cdot\cos n\theta = \Bigl(\sum_{\nu=0}^{n}\cos\nu\theta\Bigr)^2 + \Bigl(\sum_{\nu=0}^{n}\sin\nu\theta\Bigr)^2 = \Bigl|\dfrac{1-z^{n+1}}{1-z}\Bigr|^2 \ge 0, \qquad z = e^{i\theta},$

also makes this evident; we see that $s_n^{(1)}(\theta)$ is always positive and vanishes only for the arguments of the $(n+1)$th roots of unity (the root $z = 1$, i.e. $\theta = 0$, excepted). The series (11) is thus nonnegative of the first order for every real value of $\theta$.
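The three expressions for the first-order sum in (12) and (13) can be compared numerically. The sketch below is illustrative only; the function names are hypothetical.

```python
import cmath
import math

def first_order_sum(n, theta):
    """s_n^(1)(theta) as in (12): (n+1) + sum of 2 (n+1-k) cos(k theta), k = 1..n."""
    return (n + 1) + sum(2 * (n + 1 - k) * math.cos(k * theta) for k in range(1, n + 1))

def sine_quotient_form(n, theta):
    """(sin((n+1) theta / 2) / sin(theta / 2))^2, valid when sin(theta / 2) != 0."""
    return (math.sin((n + 1) * theta / 2) / math.sin(theta / 2)) ** 2

def toeplitz_form(n, theta):
    """|1 + z + ... + z^n|^2 with z = exp(i theta), the representation (13)."""
    z = cmath.exp(1j * theta)
    return abs(sum(z ** k for k in range(n + 1))) ** 2
```

The Toeplitz form makes the nonnegativity visible without any trigonometric identity, since it is the squared modulus of a complex number.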

4. It is remarkable that, on the basis of the fact just proved, according to which the coefficients of the power series of $(1-r)^{-2}F_1(r)$ are nonnegative for every value of $\theta$, the discussion of the remaining series can be carried out almost without calculation.

Because of

(14)  $F_2(r) = (\sin\theta)(1-r^2)^{-1}F_1(r)$

and since $(1-r^2)^{-1} = 1 + r^2 + r^4 + \cdots$ has nonnegative coefficients, it follows at once that the series

(15)  $\sin\theta + \sin 2\theta + \cdots + \sin n\theta + \cdots$

is decidedly positive of the first order for $0 < \theta < \pi$ and decidedly negative of the first order for $\pi < \theta < 2\pi$† (theorem of Lukács).

* For $n = 1$ we have $s_1^{(1)}(\theta) = (\sin(\theta/2))^{-2}(\sin\theta)^2 = 4\cos^2(\theta/2)$, hence positive, with $\theta = \pi$ the only exception.

† Let it be emphasized that the $s_n^{(1)}(\theta)$ of the series (15) are decidedly positive for $0 < \theta < \pi$. For since

$(1-r)^{-2}F_1(r) = 1 + 4\cos^2(\theta/2)\,r + \cdots,$

in the product $(1-r^2)^{-1}(1-r)^{-2}F_1(r)$ the coefficients with even index are $\ge 1$ and those with odd index are $\ge 4\cos^2(\theta/2)$. Finally, (14) gives that the $s_n^{(1)}(\theta)$ of the series (15) have, for $0 < \theta < \pi$, the common positive minorant $\sin\theta\cos^2(\theta/2)$.
Since further

(16)  $F_3(r) = (\cos\theta)(1-r)(1-r^2)^{-1}F_1(r, 2\theta),$

we have

(17)  $(1-r)^{-3}F_3(r) = (\cos\theta)(1-r^2)^{-1}(1-r)^{-2}F_1(r, 2\theta).$

I have thus obtained that the series

(18)  $\cos\theta + \cos 3\theta + \cos 5\theta + \cdots$

is decidedly positive of the second order for $0 < \theta < \pi/2$ and decidedly negative of the second order for $\pi/2 < \theta < \pi$.

Since

(19)  $F_4(r) = (\sin\theta)(1-r)^{-1}F_1(r, 2\theta),$

it is clear that the series

(20)  $\sin\theta + \sin 3\theta + \sin 5\theta + \cdots$

is nonnegative of order zero for $0 < \theta < \pi$.

Since

(21)  $F_6(r) = (\sin\theta)(1-r^2)^{-1}(F_1(r))^2,$

we have

(22)  $(1-r)^{-4}F_6(r) = (\sin\theta)(1-r^2)^{-1}\bigl((1-r)^{-2}F_1(r)\bigr)^2.$

This gives that the series

(23)  $\sin\theta + 2\sin 2\theta + 3\sin 3\theta + \cdots$

is positive of the third order in the interval $0 < \theta < \pi$.

Remark. Since

$(1-r)^{-2}F_1(r) = 1 + \Bigl(\dfrac{\sin\theta}{\sin(\theta/2)}\Bigr)^2 r + \cdots,$

we have

$\bigl((1-r)^{-2}F_1(r)\bigr)^2 = 1 + 8\cos^2(\theta/2)\,r + \cdots.$

Multiplying this series by $(1-r^2)^{-1} = 1 + r^2 + r^4 + \cdots$, I now obtain a power series whose coefficients are composed of nonnegative summands only. If the index of the coefficient is even, 1 is one of these summands; if it is odd, $8\cos^2(\theta/2)$ is one of them. From this it follows, in view of (22), that the third-order sums $s_n^{(3)}(\theta)$ of the series (23) have the common minorant $\sin\theta\cos^2(\theta/2)$ in the interval $0 < \theta < \pi$.
Since

(24)  $F_8(r) = (1-r)(\sin\theta)(1 + 2(1 + 2\cos^2\theta)r + r^2)(1-r^2)^{-2}(F_1(r, 2\theta))^2,$

we have

(25)  $(1-r)^{-5}F_8(r) = \sin\theta\,(1 + 2(1 + 2\cos^2\theta)r + r^2)(1-r^2)^{-2}\bigl((1-r)^{-2}F_1(r, 2\theta)\bigr)^2,$

from which the positivity of the fourth order of the series

(26)  $\sin\theta + 3\sin 3\theta + 5\sin 5\theta + \cdots$

for $0 < \theta < \pi$ results. Here, however, 4 is not the true order. Since twice the series (26) arises when we replace $\theta$ by $\pi - \theta$ in the series (23) and add the series so obtained to (23), it is clear that (26) is positive of the third order for $0 < \theta < \pi$. I have proved [Fejér 10] that in the interval $0 < \theta < \pi$ it is even positive of the second order (if one excepts the midpoint $\theta = \pi/2$ when the number of terms is even). This reduction of the order of positivity of the series (26) by one further unit, however, I succeeded in obtaining only through direct consideration of the series (26), and not through the method of the generating function used here.

Since finally

(27)  $F_5(r) = (\cos\theta - 2r + (\cos\theta)r^2)(1-r^2)^{-2}(F_1(r, \theta))^2,$

we have

(28)  $(1-r)^{-4}F_5(r) = (\cos\theta - 2r + (\cos\theta)r^2)(1-r^2)^{-2}\bigl((1-r)^{-2}F_1(r, \theta)\bigr)^2.$

For $\pi/2 < \theta < \pi$ the coefficients of the polynomial of the second degree in $r$, $\cos\theta - 2r + (\cos\theta)r^2$, are negative. I have thus obtained that the series

(29)  $\cos\theta + 2\cos 2\theta + 3\cos 3\theta + \cdots$

is negative of the third order for $\pi/2 < \theta < \pi$.


III. POSITIVITY AND MONOTONICITY OF TRIGONOMETRIC SERIES WITH SIMPLY OR MULTIPLY MONOTONE COEFFICIENT SEQUENCES

1. Let

(1)  $f(\theta) = a_0/2 + \sum_{n=1}^{\infty} a_n\cos n\theta = a_0/2 + a_1\cos\theta + a_2\cos 2\theta + \cdots + a_n\cos n\theta + \cdots$

be an infinite trigonometric cosine series with nonnegative coefficients. If in addition the coefficient sequence is monotonically decreasing and $\lim_{n\to\infty} a_n = 0$, i.e. if $\{a_n\}$ is a simply monotone null sequence, then the series (1) is, by Schlömilch's theorem, convergent everywhere for $0 < \theta < 2\pi$ and uniformly convergent in every subinterval $\varepsilon \le \theta \le 2\pi - \varepsilon$, $\varepsilon > 0$. Let the sum of the series (1) be denoted by $f(\theta)$ for $0 < \theta < 2\pi$; $f(\theta)$ is continuous at every interior point of the interval $(0, 2\pi)$.

If now the sequence $\{a_n\}$ is even a twofold monotone null sequence, i.e. if

(2)  $$\begin{aligned} & a_0 \ge 0,\ a_1 \ge 0,\ \dots,\ a_n \ge 0,\ \dots, \\ & a_0 - a_1 \ge 0,\ a_1 - a_2 \ge 0,\ \dots,\ a_n - a_{n+1} \ge 0,\ \dots, \\ & a_0 - 2a_1 + a_2 \ge 0,\ a_1 - 2a_2 + a_3 \ge 0,\ \dots,\ a_n - 2a_{n+1} + a_{n+2} \ge 0,\ \dots, \end{aligned}$$

then $f(\theta)$ is nonnegative in the whole interval $0 < \theta < 2\pi$. If the first of the second-order differences is actually positive, i.e. if

(3)  $a_0 - 2a_1 + a_2 > 0,$

then $f(\theta)$ is even positive in the whole interval, and

(4)  $f(\theta) \ge \tfrac12(a_0 - 2a_1 + a_2)$ for $0 < \theta < 2\pi$.

That this theorem does not hold for simply monotone $a_n$ is shown by the trivial example

(5)  $f(\theta) = \tfrac12 + \cos\theta.$

Here the coefficient sequence $1, 1, 0, 0, 0, \dots$ is simply monotone, while $f(\theta)$ turns out negative for $2\pi/3 < \theta < 4\pi/3$.
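The contrast between twofold and simply monotone coefficients can be seen on two short cosine polynomials. This is a sketch only: the function name and the sample coefficients $9, 4, 1$ are chosen for illustration, and the lower bound $\tfrac12(a_0 - 2a_1 + a_2)$ used in the usage note is the one that a double Abel transformation yields when only its first term is kept.

```python
import math

def cosine_series(a, theta):
    """f(theta) = a[0]/2 + sum of a[n] cos(n theta) for a finitely supported sequence a."""
    return a[0] / 2 + sum(a[n] * math.cos(n * theta) for n in range(1, len(a)))
```

The sequence $9, 4, 1, 0, 0, \dots$ is twofold monotone (its second differences are $2, 2, 1, 0, \dots$), and `cosine_series([9, 4, 1], theta)` stays above $\tfrac12(9 - 8 + 1) = 1$; the simply monotone sequence $1, 1, 0, \dots$ of example (5) gives `cosine_series([1, 1], math.pi)`, which is negative.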


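2. If $a_1, a_2, a_3, \dots$ is a fourfold monotone null sequence, i.e. if

$a_n \ge 0,\quad \Delta^1 a_n = a_n - a_{n+1} \ge 0,\quad \Delta^2 a_n = a_n - 2a_{n+1} + a_{n+2} \ge 0,\quad \Delta^3 a_n \ge 0,\quad \Delta^4 a_n \ge 0$

for $n = 1, 2, 3, \dots$, and $\lim_{n\to\infty} a_n = 0$, then the function $f(\theta)$ defined by the series

(6)  $f(\theta) = a_0/2 + a_1\cos\theta + a_2\cos 2\theta + \cdots + a_n\cos n\theta + \cdots,$

which converges for $0 < \theta < 2\pi$, is monotonically decreasing in the whole interval $0 < \theta < \pi$.

Proof. Consider the series, certainly convergent for $0 < r < 1$,

(7)  $F(r, \theta) = a_0/2 + a_1r\cos\theta + a_2r^2\cos 2\theta + \cdots + a_nr^n\cos n\theta + \cdots,$

as well as the series

(8)  $-\dfrac{\partial F(r, \theta)}{\partial\theta} = a_1r\cdot\sin\theta + a_2r^2\cdot 2\sin 2\theta + \cdots + a_nr^n\cdot n\sin n\theta + \cdots.$

If we now denote the third-order sums of the series

(9)  $\sin\theta + 2\sin 2\theta + 3\sin 3\theta + \cdots + n\sin n\theta + \cdots$

by

(10)  $s_1^{(3)}(\theta),\ s_2^{(3)}(\theta),\ s_3^{(3)}(\theta),\ \dots,\ s_n^{(3)}(\theta),\ \dots,$

then a fourfold Abel transformation gives:

(11)  $$\begin{aligned} & a_1r\cdot\sin\theta + a_2r^2\cdot 2\sin 2\theta + \cdots + a_nr^n\cdot n\sin n\theta \\ &\quad = \sum_{\nu=1}^{n-4}\Delta^4(a_\nu r^\nu)\,s_\nu^{(3)}(\theta) + \Delta^3(a_{n-3}r^{n-3})\,s_{n-3}^{(3)}(\theta) + \Delta^2(a_{n-2}r^{n-2})\,s_{n-2}^{(2)}(\theta) \\ &\qquad + \Delta(a_{n-1}r^{n-1})\,s_{n-1}^{(1)}(\theta) + a_nr^n\cdot s_n^{(0)}(\theta). \end{aligned}$$

The crudest estimate for the series (9) gives $|s_n^{(k)}(\theta)| \le n^5$ for $k = 0, 1, 2, 3$; since moreover $n^5r^n \to 0$ as $n \to \infty$, it follows from (11), by the passage to the limit $n \to \infty$, that

(12)  $-\dfrac{\partial F(r, \theta)}{\partial\theta} = \sum_{\nu=1}^{\infty} a_\nu r^\nu\,\nu\sin\nu\theta = \sum_{\nu=1}^{\infty} s_\nu^{(3)}(\theta)\,\Delta^4(a_\nu r^\nu).$

But since $a_1, a_2, \dots$ is fourfold monotone and $r, r^2, \dots$ is completely monotone for $0 < r < 1$, the sequence $a_1r, a_2r^2, \dots$ is certainly fourfold monotone. The fourth differences $\Delta^4(a_\nu r^\nu)$, $\nu = 1, 2, 3, \dots$, are therefore all nonnegative. The third-order sums $s_\nu^{(3)}(\theta)$ of the series (9) are likewise nonnegative for $0 < \theta < \pi$, as we know from §II, No. 4. Hence every term of the convergent series $\sum s_\nu^{(3)}(\theta)\Delta^4(a_\nu r^\nu)$ is nonnegative. We have thus obtained

(13)  $\dfrac{\partial F(r, \theta)}{\partial\theta} \le 0$ for $0 < \theta < \pi$;

i.e. the function $F(r, \theta)$ is, for fixed $0 < r < 1$, monotonically decreasing in the interval $0 < \theta < \pi$. Since finally, by Abel's limit theorem,

(14)  $f(\theta) = \lim_{r\to 1-0} F(r, \theta), \qquad 0 < \theta < \pi,$

$f(\theta)$ too, as the limit function of monotonically decreasing functions, is monotonically decreasing. This proves the announced theorem.*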

1st Remark. From the fourfold monotonicity of the coefficients $a_1, a_2, a_3, \dots$ we have concluded that $f(\theta)$ is monotonically decreasing in the interval $(0, \pi)$. This theorem is not correct if the sequence $a_1, a_2, a_3, \dots$ is only threefold monotone. For example, the series

(15)  $f(\theta) = 3\cos\theta + \cos 2\theta$

has the threefold monotone coefficient sequence $3, 1, 0, 0, 0, \dots$. But

$f'(\theta) = -3\sin\theta - 2\sin 2\theta = -\sin\theta\,(3 + 4\cos\theta),$

so that $f(\theta)$ increases monotonically in a part of the interval $(0, \pi)$. (On the other hand, the coefficient sequence in $f(\theta) = 4\cos\theta + \cos 2\theta$ is fourfold monotone, and indeed $f'(\theta) = -\sin\theta\,(4 + 4\cos\theta) \le 0$ for $0 < \theta < \pi$.)
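The orders of monotonicity quoted in these examples can be computed mechanically for finitely supported sequences. This is a minimal sketch: the function names and the cut-off `max_order` are assumptions of the sketch, and entries beyond the end of the list are treated as zero.

```python
from math import comb

def difference(v, k, n):
    """Delta^k v_n = sum over j of (-1)^j * C(k, j) * v_{n+j}, with v_m = 0 past the list."""
    return sum((-1) ** j * comb(k, j) * (v[n + j] if n + j < len(v) else 0)
               for j in range(k + 1))

def monotonicity_order(v, max_order=10):
    """Largest k <= max_order such that all differences of orders 0..k are nonnegative."""
    order = -1
    for k in range(max_order + 1):
        if all(difference(v, k, n) >= 0 for n in range(len(v))):
            order = k
        else:
            break
    return order
```

For example, `monotonicity_order([3, 1])` returns 3 and `monotonicity_order([4, 1])` returns 4, in agreement with the two cosine polynomials of the remark.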
2nd Remark. From the above theorem it follows that the equation

$\sum_{\nu=1}^{\infty} a_\nu\cos\nu\theta = 0$

has only a single root in the interval $0 < \theta < \pi$ if $\{a_\nu\}$ is a fourfold monotone null sequence. If $\{a_\nu\}$ is only a simply monotone null sequence, the trivial example

$\cos\theta + \cos 2\theta + \cdots + \cos n\theta = \dfrac{\sin(n\theta/2)\cos((n+1)\theta/2)}{\sin(\theta/2)} = 0$

shows that the equation can have arbitrarily many roots in the interval $0 < \theta < \pi$.

3rd Remark. If $a_1, a_2, a_3, \dots$ is fourfold monotone and $a_0/2, a_1, a_2, a_3, \dots$ is at least twofold monotone [which is always the case if $a_0 \ge 2(2a_1 - a_2)$], then $f(\theta)$ is, by what precedes, positive and monotonically decreasing for $0 < \theta < \pi$. If, in particular, $a_0, a_1, a_2, \dots$ is already twofold monotone, indeed even fourfold monotone, then $\sum_{\nu=0}^{\infty} a_\nu\cos\nu\theta$ is certainly positive and decreasing in the interval $0 < \theta < \pi$.

* We always assume $a_\nu \ge 0$; the case $a_1 = a_2 = \cdots = 0$, however, is always excluded. But then it follows, because $a_n \to 0$, that there exists at least one index $\mu$ such that $\Delta^4 a_\mu > 0$. Now (12) gives

$-\dfrac{\partial F(r, \theta)}{\partial\theta} \ge s_\mu^{(3)}(\theta)\,\Delta^4(a_\mu r^\mu),$

and by footnote * on p. 28

$\Delta^4(a_\mu r^\mu) \ge (\Delta^4 a_\mu)\,r^{\mu+4},$

so that

$-\dfrac{\partial F(r, \theta)}{\partial\theta} \ge (\Delta^4 a_\mu)\,r^{\mu+4}\,s_\mu^{(3)}(\theta).$

The right-hand side of this inequality converges for $r \to 1$ to $(\Delta^4 a_\mu)\,s_\mu^{(3)}(\theta)$. But, as we have seen in §II, No. 4, the $s_\nu^{(3)}(\theta)$ of the series (9) are all positive in the interval $0 < \theta < \pi$; moreover, they have a common positive minorant there. From this one easily concludes that the limit function $f(\theta)$ must even be a strictly decreasing function of $\theta$ in the interior of the interval $(0, \pi)$ [i.e. $f(\theta_1) > f(\theta_2)$ if $0 < \theta_1 < \theta_2 < \pi$].


3. If the coefficients $\beta_n$ of the trigonometric sine series

(16)  $f(\theta) = \beta_1\sin\theta + \beta_2\sin 2\theta + \cdots + \beta_n\sin n\theta + \cdots$

are twofold monotone, then the series is convergent for $0 < \theta < \pi$ and its sum $f(\theta)$ is positive for $0 < \theta < \pi$.

If the sequence $\{\beta_n\}$ is even fourfold monotone, then $f(\theta)$ is, on the basis of §II, monotonically decreasing in the interval $\pi/2 < \theta < \pi$.

Example. $f(\theta) = 4\sin\theta + \sin 2\theta = \sin\theta\,(4 + 2\cos\theta)$ has a fourfold monotone coefficient sequence. One sees that $f(\theta) > 0$ for $0 < \theta < \pi$. Because $f'(\theta) = 4\cos^2\theta + 4\cos\theta - 2$, we do in fact have $f'(\theta) < 0$ for $\pi/2 \le \theta \le \pi$. But we also see from this that $f(\theta)$ partly increases and partly decreases in the interval $(0, \pi/2)$.


4. If the coefficient sequence of the trigonometric cosine series

(17)  $f(\theta) = \sum_{\nu=1}^{\infty} a_\nu\cos(2\nu-1)\theta$

is threefold monotone, then its sum is positive for $0 < \theta < \pi/2$, negative for $\pi/2 < \theta < \pi$, and monotonically decreasing in the whole interval $0 < \theta < \pi$.*

The coefficient sequence of the series $f(\theta) = 2\cos\theta + \cos 3\theta = \cos\theta\,(4\cos^2\theta - 1)$ is twofold monotone. It does in fact also take negative values in the interval $0 < \theta < \pi/2$. On the other hand, $3\cos\theta + \cos 3\theta = 4\cos^3\theta$, with a threefold monotone coefficient sequence, is positive for $0 < \theta < \pi/2$ and monotonically decreasing for $0 < \theta < \pi$.


5. If the coefficient sequence of the trigonometric sine series

(18)  $f(\theta) = \sum_{\nu=1}^{\infty} a_\nu\sin(2\nu-1)\theta$

is simply monotone, then $f(\theta)$ is nonnegative for $0 < \theta < \pi$.

If the first of the differences $a_1 - a_2, a_2 - a_3, \dots$ is positive, i.e. $a_1 - a_2 > 0$, then $f(\theta)$ is even positive for $0 < \theta < \pi$ and $\ge (a_1 - a_2)\sin\theta$.

6. Let

(19)  $f(\theta) = \sum_{\nu=0}^{\infty} c_\nu\sin(\nu+1)\theta = c_0\sin\theta + c_1\sin 2\theta + c_2\sin 3\theta + \cdots$

be an arbitrary sine series with a simply monotone coefficient sequence† $c_0, c_1, c_2, \dots$, $\lim_{n\to\infty} c_n = 0$. Let $a$ further denote an arbitrarily small but fixed positive number. Then there are points in the interval $(0, a)$ at which $f(\theta)$ is positive.

* Just like its first term $a_1\cos\theta$.
† The series (19) need not, as is well known, be a Fourier series.
Proof. Since

(20)  $$\begin{aligned} f(\theta) &= \sum_{\nu=0}^{\infty} c_\nu\sin(\nu+1)\theta = \sum_{\nu=0}^{\infty} c_\nu\sin\bigl((\nu+\tfrac12)\theta + \theta/2\bigr) \\ &= \Bigl(\sum_{\nu=0}^{\infty} c_\nu\sin(\nu+\tfrac12)\theta\Bigr)\cos(\theta/2) + \Bigl(\sum_{\nu=0}^{\infty} c_\nu\cos(\nu+\tfrac12)\theta\Bigr)\sin(\theta/2), \end{aligned}$$

we have

(21)  $2\sin(\theta/2)\,f(\theta) = A(\theta)\cos(\theta/2) + B(\theta)\sin(\theta/2),$

where

(22)  $A(\theta) = \sum_{\nu=0}^{\infty}\bigl(c_\nu\cdot 2\sin(\nu+\tfrac12)\theta\cdot\sin(\theta/2)\bigr) = c_0 - \sum_{\nu=1}^{\infty}(c_{\nu-1} - c_\nu)\cos\nu\theta$

and

(23)  $B(\theta) = \sum_{\nu=0}^{\infty}\bigl(c_\nu\cdot 2\cos(\nu+\tfrac12)\theta\cdot\sin(\theta/2)\bigr) = \sum_{\nu=1}^{\infty}(c_{\nu-1} - c_\nu)\sin\nu\theta.$

From (22) one sees, since $\{c_\nu\}$ is a monotone null sequence, that

(24)  $A(\theta) \ge c_0 - \sum_{\nu=1}^{\infty}(c_{\nu-1} - c_\nu) = 0$

for every value of $\theta$. Furthermore, $\cos(\theta/2)$ and $\sin(\theta/2)$ are positive in the interior of the interval $(0, \pi)$. If therefore I can show that

(25)  $B(\theta) = \sum_{\nu=1}^{\infty}(c_{\nu-1} - c_\nu)\sin\nu\theta$

also takes a positive value in the interval $(0, a)$, where $0 < a < \pi$, then, in view of (21), our theorem is proved. [I have to prove for the series (25) the same assertion that I made for the series (19).] For the sine series $B(\theta)$ in (23), however, the series of coefficients

(26)  $\sum_{\nu=1}^{\infty}(c_{\nu-1} - c_\nu)$

consists of nonnegative terms only and is convergent. Hence the sine series of $B(\theta)$ is uniformly convergent for every $\theta$. Integration from 0 to $a$ gives

(27)  $\int_0^a B(\theta)\,d\theta = \sum_{\nu=1}^{\infty}(c_{\nu-1} - c_\nu)\,\dfrac{1 - \cos\nu a}{\nu}.$

If now $B(\theta) \le 0$ held throughout the interval $0 < \theta < a$, we should necessarily have

(28)  $\sum_{\nu=1}^{\infty}(c_{\nu-1} - c_\nu)\,\dfrac{1 - \cos\nu a}{\nu} \le 0.$

If $p$ denotes the first index at which an actually positive difference occurs in the sequence of nonnegative differences

(29)  $c_0 - c_1,\ c_1 - c_2,\ c_2 - c_3,\ \dots,$

we should therefore have

(30)  $(c_{p-1} - c_p)(1 - \cos pa) \le 0.$

But this is certainly not the case if $0 < a < \pi/(4p)$, since then the left-hand side of (30) is positive.*
7. Incidentally, the corresponding theorem for the cosine series

(31)  $f(\theta) = c_1\cos\theta + c_2\cos 2\theta + c_3\cos 3\theta + \cdots$

with simply monotone coefficient sequence

(32)  $c_1, c_2, \ldots$,  $\lim_{n\to\infty} c_n = 0$,

is likewise true. For since $\{c_n\}$ is simply monotone, the series

(33)  $\sum_{n=1}^{\infty} c_n\,\dfrac{\sin n\theta}{n}$

is, as is well known, uniformly convergent for every $\theta$. Hence, if $0<\theta<a$,

(34)  $\int_\theta^a f(t)\,dt = \sum_{n=1}^{\infty} c_n\,\dfrac{\sin na}{n} - \sum_{n=1}^{\infty} c_n\,\dfrac{\sin n\theta}{n}$,

so that for $\theta\to+0$

(35)  $\int_0^a f(t)\,dt = \sum_{n=1}^{\infty} c_n\,\dfrac{\sin na}{n}$.

Now it is well known that

(36)  $\dfrac{\sin a}{1} + \dfrac{\sin 2a}{2} + \cdots + \dfrac{\sin na}{n} > 0$

for $0<a<\pi$ and for every $n$. If therefore $\{c_n\}$ denotes a nonnegative simply monotone null sequence, then $\sum_{n=1}^{\infty} c_n(\sin na)/n$ is also positive in the interval $0<a<\pi$. Hence by (35)

(37)  $\int_0^a f(t)\,dt > 0$.

It is therefore impossible that $f(\theta) \le 0$ holds throughout the interval $0<\theta<a$ for the cosine series $f(\theta)$ under (31). Here $a$ denotes a positive number.

* Incidentally, formulas (21), (22), (23) also show that $(\sin\theta)f(\theta)$ tends to $0$ as $\theta\to+0$, whereas $f(\theta)$ itself may become infinite in a non-integrable manner as $\theta\to+0$.
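Inequality (36) is the key fact used in No. 7. As a hedged numerical illustration (not a proof), the sketch below samples the partial sums on a grid in $(0, \pi)$; the function name `sine_partial_sum`, the range of $n$ and the grid are arbitrary choices made here.

```python
import math

def sine_partial_sum(n, a):
    """sin(a)/1 + sin(2a)/2 + ... + sin(na)/n, the left side of (36)."""
    return sum(math.sin(k * a) / k for k in range(1, n + 1))

# Grid strictly inside (0, pi); n runs over 1..40.
grid = [j * math.pi / 400 for j in range(1, 400)]
smallest = min(sine_partial_sum(n, a) for n in range(1, 41) for a in grid)

print(smallest > 0)
```

The smallest sampled values occur for even $n$ close to $a = \pi$, where the partial sum is small but still positive.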

8. With a view to the application to the sine series of the Legendre polynomial $P_n(\cos\theta)$, I mention the following theorem, which corresponds to the preceding one:

If the sine series

(38)  $f(\theta) = \sum_{\nu=0}^{\infty} c_\nu \sin(n+2\nu+1)\theta = c_0\sin(n+1)\theta + c_1\sin(n+3)\theta + \cdots$

has a simply monotone coefficient sequence

(39)  $c_0, c_1, c_2, \ldots$,

then its sum $f(\theta)$ takes positive values in every interval $(0, a)$, however small, where $a>0$.

The proof runs quite similarly to the one given in No. 6. It now rests on the equation

(40)  $2(\sin\theta)f(\theta) = A(\theta)\cos n\theta + B(\theta)\sin n\theta$,

where

(41)  $A(\theta) = c_0 - \sum_{\nu=1}^{\infty} (c_{\nu-1}-c_\nu)\cos 2\nu\theta$,

(42)  $B(\theta) = \sum_{\nu=1}^{\infty} (c_{\nu-1}-c_\nu)\sin 2\nu\theta$.
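Identity (40) with (41) and (42) can be sanity-checked for a finite coefficient list, which is the same as a sequence continued by zeros. This is a minimal sketch; the helper name `identity_error` and the sample coefficients are assumptions made for illustration.

```python
import math

def identity_error(c, n, theta):
    """|2 sin(θ) f(θ) − (A(θ) cos nθ + B(θ) sin nθ)| for a finite list c."""
    ext = list(c) + [0.0]            # c_ν = 0 beyond the end of the list
    f = sum(cv * math.sin((n + 2 * v + 1) * theta) for v, cv in enumerate(c))
    A = ext[0] - sum((ext[v - 1] - ext[v]) * math.cos(2 * v * theta)
                     for v in range(1, len(ext)))
    B = sum((ext[v - 1] - ext[v]) * math.sin(2 * v * theta)
            for v in range(1, len(ext)))
    return abs(2 * math.sin(theta) * f - (A * math.cos(n * theta) + B * math.sin(n * theta)))

c = [1.0 / (v + 1) for v in range(30)]        # an illustrative monotone sequence
max_error = max(identity_error(c, 3, 0.01 + 0.05 * j) for j in range(60))

print(max_error)
```

The identity is purely algebraic (summation by parts), so it holds for any finite list, monotone or not; monotonicity only enters through the sign of $A(\theta)$.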


IV. ON THE ZEROS OF TRIGONOMETRIC REMAINDER SERIES
WITH SIMPLY OR MULTIPLY MONOTONE COEFFICIENTS.
HEINE TYPE


1. A polynomial or a series of the form

(1)  $a_{n+1}\cos(n+1)\theta + a_{n+2}\cos(n+2)\theta + \cdots$

(2)  $b_{n+1}\sin(n+1)\theta + b_{n+2}\sin(n+2)\theta + \cdots$

may be called a trigonometric remainder series. In view of the application to the Legendre polynomials I wish to treat the sine series in particular detail here, and even this only in the case where merely the sine of every second multiple of $\theta$ actually occurs in the series. With changed notation we are thus concerned with the series

(3)  $f(\theta) = \sum_{\nu=0}^{\infty} c_\nu \sin(n+2\nu+1)\theta = c_0\sin(n+1)\theta + c_1\sin(n+3)\theta + \cdots$,

where $n$ denotes a fixed nonnegative integer. Of a trigonometric series of this form I say that it is of Heine type.

Let the coefficient sequence $c_0, c_1, \ldots, c_\nu, \ldots$ be at least simply monotone, and let at least $c_0-c_1>0$. Further let $\lim_{\nu\to\infty} c_\nu = 0$. Then the series (3) converges for every value $0<\theta<\pi$ and is uniformly convergent in every subinterval $\varepsilon \le \theta \le \pi-\varepsilon$, $\varepsilon>0$. Its sum, denoted by $f(\theta)$, is therefore continuous for every $0<\theta<\pi$.

I now decompose the series into two summands, much as Szegő did in the case of a trigonometric polynomial, and here in three ways:

(4)  $f(\theta) = p(\theta)\sin(n-1)\theta + q(\theta)\cos(n-1)\theta$,

(5)  $f(\theta) = r(\theta)\sin n\theta + s(\theta)\cos n\theta$,

(6)  $f(\theta) = t(\theta)\sin(n+1)\theta + u(\theta)\cos(n+1)\theta$,

where

(7)  $p(\theta) = \sum_{\nu=0}^{\infty} c_\nu\cos 2(\nu+1)\theta$,  $q(\theta) = \sum_{\nu=0}^{\infty} c_\nu\sin 2(\nu+1)\theta$,

(8)  $r(\theta) = \sum_{\nu=0}^{\infty} c_\nu\cos(2\nu+1)\theta$,  $s(\theta) = \sum_{\nu=0}^{\infty} c_\nu\sin(2\nu+1)\theta$,

(9)  $t(\theta) = \sum_{\nu=0}^{\infty} c_\nu\cos 2\nu\theta$,  $u(\theta) = \sum_{\nu=0}^{\infty} c_\nu\sin 2\nu\theta$.

The six trigonometric series $p(\theta), \ldots, u(\theta)$ are convergent everywhere in the interval $0<\theta<\pi$ and, moreover, uniformly convergent in the interval $\varepsilon \le \theta \le \pi-\varepsilon$, so that the six functions $p(\theta), \ldots, u(\theta)$ are all continuous in the interval $0<\theta<\pi$. Depending on which roots of the six equations

$\sin(n-1)\theta = 0$,  $\cos(n-1)\theta = 0$,
$\sin n\theta = 0$,  $\cos n\theta = 0$,
$\sin(n+1)\theta = 0$,  $\cos(n+1)\theta = 0$

one considers, and possibly subjecting the coefficient sequence to a condition of higher monotonicity as well, one obtains various statements about the roots of the equation $f(\theta) = 0$ in the interval $0<\theta<\pi$. Of these 6 cases I shall treat only the 3 that I consider more important.

2. Case 1: $\sin n\theta = 0$, i.e.

$\theta_m = \dfrac{m\pi}{n}$  $(m = 1, 2, \ldots, n-1)$.

These are the roots of $\sin n\theta = 0$ that fall in the interior of the interval $(0, \pi)$. Equation (5) now gives

(10)  $f(\theta_m) = s(\theta_m)\cos n\theta_m = s(\theta_m)\cos m\pi = s(\theta_m)(-1)^m$.

But since $s(\theta)$ under (8), as was shown in §III, is positive everywhere in the interval $0<\theta<\pi$, we have

(11)  $\operatorname{sgn} f(\theta_m) = (-1)^m$  $(m = 1, 2, \ldots, n-1)$.

I have thus found that $f(\theta)$ has a zero in each of the $(n-2)$ intervals $(\pi/n, 2\pi/n)$, $(2\pi/n, 3\pi/n)$, $\ldots$, $((n-2)\pi/n, (n-1)\pi/n)$. But since $f(\theta_1) = f(\pi/n)$ is negative, and $f(\theta)$, by the theorem in §III, No. 8, also takes positive values in any right-hand neighborhood of $\theta = 0$, it must also have a zero in the interior of the first interval $(0, \pi/n)$. From this it follows, on the basis of the symmetry equation resulting from (3),

(12)  $f(\pi-\theta) = \sum_{\nu=0}^{\infty} c_\nu\sin(n+2\nu+1)(\pi-\theta) = -\sum_{\nu=0}^{\infty} c_\nu\cos(n+2\nu+1)\pi\,\sin(n+2\nu+1)\theta = (-1)^n f(\theta)$,

that $f(\theta)$ also has a zero in the interior of the last interval $((n-1)\pi/n, \pi)$. Thus the following theorem holds:

If $c_0, c_1, c_2, \ldots$ are nonnegative, nonincreasing numbers with $\lim_{\nu\to\infty} c_\nu = 0$, then the series

(13)  $f(\theta) = \sum_{\nu=0}^{\infty} c_\nu\sin(n+2\nu+1)\theta = c_0\sin(n+1)\theta + c_1\sin(n+3)\theta + \cdots$

is convergent everywhere in the interval $0<\theta<\pi$, and its sum $f(\theta)$ is a continuous function for these values of $\theta$. If the interval $(0, \pi)$ is divided into $n$ equal parts, then $f(\theta)$ has at least one zero (indeed a point of sign change) in each of these subintervals. That is, $f(\theta)$ has at least $n$ zeros $t_1, t_2, \ldots, t_n$ in the interval $(0, \pi)$ such that

(14)  $(k-1)\pi/n < t_k < k\pi/n$  $(k = 1, 2, \ldots, n)$.
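The sign pattern (11), together with positivity near $\theta = 0$, is what forces one zero into each of the $n$ equal parts. The sketch below illustrates this for one assumed example: $n = 5$ and the truncated sequence $c_\nu = 1/(\nu+1)$ are illustrative choices, and `heine_type_sum` is a hypothetical helper name.

```python
import math

def heine_type_sum(c, n, theta):
    """f(θ) = Σ c_ν sin((n + 2ν + 1) θ) for a finite coefficient list c."""
    return sum(cv * math.sin((n + 2 * v + 1) * theta) for v, cv in enumerate(c))

n = 5
c = [1.0 / (v + 1) for v in range(50)]     # nonnegative and nonincreasing

# Values at the interior nodes θ_m = mπ/n, m = 1, ..., n−1
values = [heine_type_sum(c, n, m * math.pi / n) for m in range(1, n)]
# (11): the sign at θ_m should be (−1)^m, i.e. positive exactly for even m
signs_match = all((val > 0) == (m % 2 == 0) for m, val in enumerate(values, start=1))
# Positive close to 0 and negative at θ_1: a zero lies in (0, π/n)
first_interval_changes_sign = heine_type_sum(c, n, 0.01) > 0 > values[0]

print(signs_match, first_interval_changes_sign)
```

A truncated list is itself a nonnegative, nonincreasing sequence continued by zeros, so the theorem applies to the finite sum exactly, not only approximately.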
3. Case 2: $\sin(n+1)\theta = 0$. Let now

(15)  $\theta_m = \dfrac{m\pi}{n+1}$  $(m = 0, 1, 2, \ldots)$.

By substitution into equation (6) we obtain

(16)  $f(\theta_m) = u(\theta_m)\cos m\pi = u(\theta_m)(-1)^m$.

Let now $\{c_\nu\}$ be a doubly monotone null sequence. Then, by §III,

(17)  $u(\theta) = \sum_{\nu=0}^{\infty} c_\nu\sin 2\nu\theta$

is positive in the interior of the interval $(0, \pi/2)$. Now first let $n$ be even. Then $\theta_1, \theta_2, \ldots, \theta_{n'}$, where $n' = n/2$, fall in the interior of $(0, \pi/2)$, so that

(18)  $\operatorname{sgn} f(\theta_m) = (-1)^m$

holds. This says that $f(\theta)$ has a zero in the interior of each of the intervals $(\theta_1, \theta_2)$, $(\theta_2, \theta_3)$, $\ldots$, $(\theta_{n'-1}, \theta_{n'})$, hence $n'-1$ zeros in all. But since a zero is also present in the interior of $(0, \theta_1)$, we have proved that $f(\theta)$ has altogether $n'$ distinct zeros in the interior of $(0, \pi/2)$ and, owing to the symmetry property (12), $2n' = n$ distinct zeros in the interior of $(0, \pi)$. Secondly, let $n$ be odd, so that now $n' = (n+1)/2$. Then we again have $n'-1$ zeros in the interior of $(0, \pi/2)$, hence $2n'-2$ in the interior of $(0, \pi)$. To these is now added the zero $\theta_{n'} = \pi/2$. We therefore again have $2n'-1 = n$ zeros in the interior of $(0, \pi)$. I have thus proved the following theorem:

If the numbers $\{c_\nu\}$ are doubly monotone and $\lim_{\nu\to\infty} c_\nu = 0$, then the sum $f(\theta)$ of the series

(19)  $f(\theta) = \sum_{\nu=0}^{\infty} c_\nu\sin(n+2\nu+1)\theta$

has at least $n$ zeros $t_1, t_2, \ldots, t_n$ in the interval $0<\theta<\pi$ (located symmetrically with respect to the midpoint $\theta = \pi/2$). The bounds

(20)  $(m-1)\dfrac{\pi}{n+1} < t_m < m\,\dfrac{\pi}{n+1}$

hold, where $m = 1, 2, 3, \ldots, n' = [(n+1)/2]$.

If $n$ is odd, then in the last inequality $t_{n'} < n'\pi/(n+1)$ the sign $<$ must be replaced by the equality sign.

4. Case 3: $\cos n\theta = 0$, hence

(21)  $\theta_m^* = (2m-1)\pi/(2n)$.

Equation (5) now gives

(22)  $f(\theta_m^*) = r(\theta_m^*)\sin\big((2m-1)\pi/2\big) = r(\theta_m^*)(-1)^{m+1}$.

Let the null sequence $\{c_\nu\}$ be triply monotone. Then

(23)  $r(\theta) = \sum_{\nu=0}^{\infty} c_\nu\cos(2\nu+1)\theta$

is positive in the interior of $(0, \pi/2)$. Hence

(24)  $\operatorname{sgn} f(\theta_m^*) = (-1)^{m+1}$,

as long as $\theta_m^*$ lies in the interior of $(0, \pi/2)$.

5. Let us now consider once more the $\theta$-points of Case 2:

(25)  $\theta_m = \dfrac{m\pi}{n+1}$.

We can first establish that

(26)  $0 < \theta_1^* < \theta_1 < \theta_2^* < \theta_2 < \theta_3^* < \theta_3 < \cdots < \theta_{n'}^* < \theta_{n'}$.

This chain of inequalities is correct for even $n$ in the form in which it is written down. If, however, $n$ is odd, then the end must read, instead of $\theta_{n'}^* < \theta_{n'}$, as follows: $\theta_{n'}^* = \theta_{n'} = \pi/2$.

Since by (24) $\operatorname{sgn} f(\theta_m^*) = (-1)^{m+1}$ and by (18) $\operatorname{sgn} f(\theta_m) = (-1)^m$, the function $f(\theta)$ evidently has a zero in each of the $n' = [(n+1)/2]$ intervals $(\theta_m^*, \theta_m)$, $m = 1, 2, \ldots, n'$. In the case of an odd $n$ this statement is to be modified only insofar as in this case $\theta_{n'}^* = \theta_{n'} = \pi/2$ holds for the last pair, so that the interval $(\theta_{n'}^*, \theta_{n'})$ shrinks to the point $\pi/2$. Because of (3), however, $\theta = \pi/2$ is now obviously a root of $f(\theta) = 0$. I have thus obtained the following theorem:

If the coefficient sequence $\{c_\nu\}$ of the trigonometric series

(27)  $f(\theta) = \sum_{\nu=0}^{\infty} c_\nu\sin(n+2\nu+1)\theta = c_0\sin(n+1)\theta + c_1\sin(n+3)\theta + \cdots$

is a triply monotone null sequence, then the function $f(\theta)$, continuous for $0<\theta<\pi$, has at least $n$ zeros $t_1, t_2, \ldots, t_n$ in the interval $(0, \pi)$. For the zeros located in $0<\theta\le\pi/2$ the bounds

(28)  $(m-\tfrac12)\dfrac{\pi}{n} < t_m < m\,\dfrac{\pi}{n+1}$

hold.

If $n$ is odd, then instead of the last of these inequalities one must write

(29)  $(n'-\tfrac12)\dfrac{\pi}{n} = t_{n'} = n'\,\dfrac{\pi}{n+1}$  $(= \pi/2)$.


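V. ON THE ZEROS OF AN ARBITRARY TRIGONOMETRIC
REMAINDER SERIES WITH SIMPLY OR MULTIPLY MONOTONE
COEFFICIENTS


1. Let now a trigonometric sine series of the form

(1)  $b_{n+1}\sin(n+1)\theta + b_{n+2}\sin(n+2)\theta + b_{n+3}\sin(n+3)\theta + \cdots$

be given, in which all integral multiples of $\theta$, from the $(n+1)$th onward, occur. Here too I shall not concern myself with the corresponding cosine series. With changed notation I set

(2)  $f(\theta) = \sum_{\nu=0}^{\infty} c_\nu\sin(n+\nu+1)\theta$

and now consider the decompositions

(3)  $f(\theta) = A(\theta)\sin n\theta + B(\theta)\cos n\theta$,

(4)  $f(\theta) = C(\theta)\sin(n+\tfrac12)\theta + D(\theta)\cos(n+\tfrac12)\theta$,

(5)  $f(\theta) = E(\theta)\sin(n+1)\theta + F(\theta)\cos(n+1)\theta$,

where

(6)  $A(\theta) = \sum_{\nu=0}^{\infty} c_\nu\cos(\nu+1)\theta$,  $B(\theta) = \sum_{\nu=0}^{\infty} c_\nu\sin(\nu+1)\theta$,

(7)  $C(\theta) = \sum_{\nu=0}^{\infty} c_\nu\cos(\nu+\tfrac12)\theta$,  $D(\theta) = \sum_{\nu=0}^{\infty} c_\nu\sin(\nu+\tfrac12)\theta$,

(8)  $E(\theta) = \sum_{\nu=0}^{\infty} c_\nu\cos\nu\theta$,  $F(\theta) = \sum_{\nu=0}^{\infty} c_\nu\sin\nu\theta$.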

2. Case 1: $\sin(n+\tfrac12)\theta = 0$, hence

(9)  $\theta_m^{(1)} = \dfrac{2m\pi}{2n+1}$  $(m = 1, 2, \ldots, n)$.

If now $c_0, c_1, \ldots$ is merely simply monotone, then $C(\theta)$ and $D(\theta)$ are convergent and continuous in $0<\theta<2\pi$, and $D(\theta)$ is certainly positive for $0<\theta\le\pi$. Hence equation (4) gives

(10)  $f(\theta_m^{(1)}) = D(\theta_m^{(1)})\cos m\pi$

and

(11)  $\operatorname{sgn} f(\theta_m^{(1)}) = (-1)^m$  $(m = 1, 2, \ldots, n)$.

3. Case 2: $\sin(n+1)\theta = 0$, hence

(12)  $\theta_m^{(2)} = m\,\dfrac{\pi}{n+1}$  $(m = 1, 2, \ldots, n)$.

Equation (5) gives

(13)  $f(\theta_m^{(2)}) = F(\theta_m^{(2)})\cos m\pi$.

But if $c_1, c_2, c_3, \ldots$ is doubly monotone,* then

(14)  $F(\theta) = \sum_{\nu=0}^{\infty} c_\nu\sin\nu\theta = c_1\sin\theta + c_2\sin 2\theta + \cdots$

is positive for $0<\theta<\pi$, hence

(15)  $\operatorname{sgn} f(\theta_m^{(2)}) = (-1)^m$.

4. Case 3: $\cos(n+\tfrac12)\theta = 0$, hence

(16)  $\theta_m^{(3)} = (2m-1)\dfrac{\pi}{2n+1}$.

If $c_0, c_1, \ldots$ is triply monotone, then

(17)  $C(\theta) = \sum_{\nu=0}^{\infty} c_\nu\cos(\nu+\tfrac12)\theta = c_0\cos(\theta/2) + c_1\cos(3\theta/2) + c_2\cos(5\theta/2) + \cdots$

is positive for $0<\theta<\pi$. One therefore has

(18)  $f(\theta_m^{(3)}) = C(\theta_m^{(3)})\sin\big((2m-1)\pi/2\big)$,

i.e.

(19)  $\operatorname{sgn} f(\theta_m^{(3)}) = (-1)^{m+1}$.

5. We have

(20)  $0 < \theta_1^{(3)} < \theta_1^{(2)} < \theta_2^{(3)} < \theta_2^{(2)} < \cdots < \theta_n^{(3)} < \theta_n^{(2)}$.

* Note the independence from $c_0$.

This follows, for example, immediately from the fact that the $\theta_m^{(3)}$ are consecutive roots of $\cos(n+\tfrac12)\theta = 0$, while the $\theta_m^{(2)}$ are consecutive roots of $\sin(n+1)\theta = 0$, and that

(21)  $\sin(n+1)\theta = \sin(n+\tfrac12)\theta\cos(\theta/2) + \cos(n+\tfrac12)\theta\sin(\theta/2)$.

Incidentally, the length of the interval $(\theta_m^{(3)}, \theta_m^{(2)})$ is

(22)  $\theta_m^{(2)} - \theta_m^{(3)} = m\,\dfrac{\pi}{n+1} - (2m-1)\dfrac{\pi}{2n+1} = \dfrac{n-m+1}{(n+1)(2n+1)}\,\pi = \Big(1-\dfrac{m}{n+1}\Big)\dfrac{\pi}{2n+1}$.

The length of these separated intervals therefore decreases in arithmetic progression as $m$ runs through $1, 2, \ldots, n$ in turn. At the initial point $\theta_m^{(3)}$ the sign of $f(\theta)$ is $(-1)^{m+1}$, at the end point $\theta_m^{(2)}$ it is $(-1)^m$.

On the basis of the foregoing I can therefore state the following theorem:

If the coefficients of the series

(23)  $f(\theta) = b_{n+1}\sin(n+1)\theta + b_{n+2}\sin(n+2)\theta + \cdots$

form a simply monotone null sequence (i.e. an ordinary nonnegative, nonincreasing null sequence), then the function $f(\theta)$, continuous in the interval $0<\theta<\pi$, has at least $n$ distinct zeros in the interior of this interval, namely at least one in the interior of each of the $n$ intervals

(24)  $\Big((m-1)\dfrac{\pi}{n+\frac12},\ m\,\dfrac{\pi}{n+\frac12}\Big)$  $(m = 1, 2, \ldots, n)$.

This theorem cannot be sharpened, inasmuch as a series of the form (23) may in special cases have exactly $n$ zeros in the interior of $(0, \pi)$. This is shown by the trivial example

(25)  $b_{n+1} = 1$,  $b_{n+2} = b_{n+3} = \cdots = 0$,

i.e. the function $\sin(n+1)\theta$. [All of its zeros in $(0, \pi)$ are $m\pi/(n+1)$, $m = 1, 2, \ldots, n$. They lie one in each of the intervals (24).]

Applied to sine polynomials, this theorem reads as follows:

If the coefficient sequence of the sine polynomial

(26)  $f(\theta) = c_{n+1}\sin(n+1)\theta + c_{n+2}\sin(n+2)\theta + \cdots + c_p\sin p\theta$

is monotonically decreasing and positive, i.e.

(27)  $c_{n+1} \ge c_{n+2} \ge \cdots \ge c_p > 0$,

then $f(\theta)$ has at least $n$ zeros $t_1, t_2, \ldots, t_n$ in the interior of the interval $(0, \pi)$, for which the inequalities

(28)  $(m-1)\dfrac{\pi}{n+\frac12} < t_m < m\,\dfrac{\pi}{n+\frac12}$  $(m = 1, 2, \ldots, n)$

hold.


On this, and on the preceding theorem concerning the case of the infinite series, I remark the following. Sturm proved that a sine polynomial (26) ($c_{n+1}\ne 0$, $c_p\ne 0$) has at most $p-1$ and at least $n$ zeros in the interior of the interval $(0, \pi)$, whatever values its coefficients $c_{n+1}, c_{n+2}, \ldots, c_p$ may have. Hurwitz [15] generalized Sturm's theorem by proving that the infinite sine series (1) also has at least $n$ zeros in the interior of the interval $(0, \pi)$, provided that it is the Fourier sine series of a function that is bounded in the interval $0\le\theta\le\pi$ and continuous for $0<\theta<\pi$.

Our theorem on the sine polynomial and on the infinite sine series now makes the theorems of Sturm and Hurwitz more precise in the direction taken by Szegő, inasmuch as it supplies $n$ separated subintervals of $(0, \pi)$, each containing at least one zero of the finite or infinite series; admittedly only for the special, yet not unimportant, case in which the coefficients are positive and monotonically decreasing. (As regards the case of the infinite series, let it be noted that my theorem also covers, for example, the series

(29)  $\dfrac{\sin(n+1)\theta}{\log(n+1)} + \dfrac{\sin(n+2)\theta}{\log(n+2)} + \cdots$,

which, as is well known, is not a Fourier series.)

6. On the basis of the foregoing, the following theorem also holds:

If the coefficient sequence of the infinite trigonometric series

(30)  $f(\theta) = b_{n+1}\sin(n+1)\theta + b_{n+2}\sin(n+2)\theta + \cdots$

is a triply monotone null sequence, then its sum $f(\theta)$, continuous for $0<\theta<\pi$, has at least one zero in each of the following $n$ intervals, which are completely separated and independent of the values of the coefficients:

(31)  $\Big((2m-1)\dfrac{\pi}{2n+1},\ m\,\dfrac{\pi}{n+1}\Big)$  $(m = 1, 2, \ldots, n)$.

(The length of the $m$th interval is

(32)  $\dfrac{n-m+1}{(n+1)(2n+1)}\,\pi$,

and the sum of the $n$ intervals is $n\pi/(4n+2)$, so that the ratio of the total length of the enclosing intervals (31) to $\pi$ converges to $\tfrac14$ as $n\to\infty$.)
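The arithmetic behind (31) and (32) can be reproduced exactly with rational numbers. This is a small sketch in which all lengths are expressed in units of $\pi$; the function name `interval_lengths_over_pi` and the choice $n = 7$ are illustrative.

```python
from fractions import Fraction

def interval_lengths_over_pi(n):
    """Lengths, divided by π, of the n intervals (31): m/(n+1) − (2m−1)/(2n+1)."""
    return [Fraction(m, n + 1) - Fraction(2 * m - 1, 2 * n + 1) for m in range(1, n + 1)]

n = 7
lengths = interval_lengths_over_pi(n)
# Closed form (32), again in units of π
closed_form = [Fraction(n - m + 1, (n + 1) * (2 * n + 1)) for m in range(1, n + 1)]
total = sum(lengths)
# Consecutive differences are constant: an arithmetic progression
steps = {lengths[i] - lengths[i + 1] for i in range(n - 1)}

print(lengths == closed_form, total == Fraction(n, 4 * n + 2), steps)
```

For large $n$ the total $n/(4n+2)$ approaches $1/4$, in agreement with the remark in parentheses above.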


VI. HEINE'S INFINITE TRIGONOMETRIC SERIES FOR THE
SPHERICAL HARMONICS. APPLICATION OF THE GENERAL RESULTS
PRESENTED ABOVE TO THIS SPECIAL CASE


1. Heine found the following form for the Fourier sine series of the $n$th Legendre polynomial [Heine 13, Vol. I, p. 89, Stieltjes 24, Hobson 14]:

(1)  $(\pi/4)P_n(\cos\theta) = \dfrac{2\cdot4\cdots(2n)}{3\cdot5\cdots(2n+1)}\Big\{\sin(n+1)\theta + \dfrac{1}{1}\,\dfrac{n+1}{2n+3}\sin(n+3)\theta + \dfrac{1\cdot3}{1\cdot2}\,\dfrac{(n+1)(n+2)}{(2n+3)(2n+5)}\sin(n+5)\theta + \cdots\Big\}$,

i.e.

(2)  $(\pi/4)P_n(\cos\theta) = \dfrac{2\cdot4\cdots(2n)}{3\cdot5\cdots(2n+1)}\sum_{\nu=0}^{\infty}\dfrac{1\cdot3\cdots(2\nu-1)}{1\cdot2\cdots\nu}\,\dfrac{(n+1)(n+2)\cdots(n+\nu)}{(2n+3)(2n+5)\cdots(2n+2\nu+1)}\sin(n+2\nu+1)\theta$  $(0<\theta<\pi)$,

where

(3)  for $\nu = 0$ the coefficient under the summation sign is to be read as $1$.

If one introduces the coefficients $\alpha_\nu$, defined by

(4)  $\alpha_0 = 1$,  $\alpha_\nu = \dfrac{1\cdot3\cdot5\cdots(2\nu-1)}{2\cdot4\cdot6\cdots(2\nu)}$  $(\nu = 1, 2, \ldots)$,

one finds that also

(5)  $(\pi/4)\,\alpha_n(2n+1)P_n(\cos\theta) = \sum_{\nu=0}^{\infty}\alpha_\nu\,\dfrac{(2n+2)(2n+4)\cdots(2n+2\nu)}{(2n+3)(2n+5)\cdots(2n+2\nu+1)}\sin(n+2\nu+1)\theta$,

i.e.

(6)  $(\pi/4)P_n(\cos\theta) = c_0\sin(n+1)\theta + c_1\sin(n+3)\theta + \cdots = \sum_{\nu=0}^{\infty} c_\nu\sin(n+2\nu+1)\theta$

holds, where*

(7)  $c_\nu = \dfrac{(2\nu+2)(2\nu+4)\cdots(2\nu+2n)}{(2\nu+1)(2\nu+3)\cdots(2\nu+2n-1)}\cdot\dfrac{1}{2\nu+2n+1}$  $(\nu = 0, 1, 2, 3, \ldots)$.†

I claim that the infinite coefficient sequence $c_0, c_1, \ldots, c_\nu, \ldots$ is completely monotone. Indeed, to begin with, $c_\nu$ consists, by (7), of $n+1$ factors, i.e. of a fixed number of factors. Any one of the first $n$ factors has the form

(8)  $f_\nu^{(k)} = \dfrac{2\nu+2k}{2\nu+2k-1} = 1 + \dfrac{1}{2\nu+2k-1}$,

where $k$ denotes a fixed positive integer, namely one of the numbers $1, 2, \ldots, n$. If $\nu$ now runs through the numbers $\nu = 0, 1, 2, 3, \ldots$, we obtain a completely monotone sequence $f_0^{(k)}, f_1^{(k)}, f_2^{(k)}, \ldots, f_\nu^{(k)}, \ldots$. This follows from the fact that the sequence

(9)  $\dfrac{1}{2\nu+2k-1}$  $(\nu = 0, 1, 2, 3, \ldots)$

is completely monotone. Finally, the $(n+1)$th factor $f_\nu^{(n+1)} = (2\nu+2n+1)^{-1}$ also yields a completely monotone sequence when $\nu = 0, 1, 2, 3, \ldots$ is substituted. By §I, however, the sequence $c_\nu = f_\nu^{(1)} f_\nu^{(2)}\cdots f_\nu^{(n+1)}$, $\nu = 0, 1, 2, 3, \ldots$, is then completely monotone as well. (This fact naturally admits various other proofs. Cf., for example, the introduction of this paper.)

* For example, for $n = 0$ the series reads $\dfrac{\pi}{4} = \sum_{\nu=0}^{\infty}\dfrac{\sin(2\nu+1)\theta}{2\nu+1}$.
† For another representation of $c_\nu$ cf. the introduction of this paper.

On the basis of the theorem of §IV, No. 2, I can state the following theorem from the mere fact that the coefficients of the Heine series for $(\pi/4)P_n(\cos\theta)$ are positive and decrease monotonically, which is evident from any one of its forms:

If the interval $(0, \pi)$ is divided into $n$ equal parts, then $P_n(\cos\theta)$ has a zero in the interior of each part. That is, if $t_1, t_2, \ldots, t_n$ are the zeros of $P_n(\cos\theta)$ in the interval $0<\theta<\pi$, then

(10)  $(m-1)\dfrac{\pi}{n} < t_m < m\,\dfrac{\pi}{n}$  $(m = 1, 2, \ldots, n)$.

If one takes into account the fact that the coefficient sequence of the Heine sine series is triply monotone (indeed even completely monotone), then the theorem of §IV, No. 5 yields:

The zeros $t_m$ of $P_n(\cos\theta)$ that lie in the interval $0<\theta\le\pi/2$ are enclosed in the subintervals

(11)  $(m-\tfrac12)\dfrac{\pi}{n} < t_m < m\,\dfrac{\pi}{n+1}$,  $m = 1, 2, 3, \ldots, n' = \Big[\dfrac{n+1}{2}\Big]$.

If $n$ is odd, then instead of the last of these inequalities

(12)  $(n'-\tfrac12)\dfrac{\pi}{n} = t_{n'} = n'\,\dfrac{\pi}{n+1} = \dfrac{\pi}{2}$

must stand.

This is the Markoff-Stieltjes bound for the zeros of the $n$th Legendre polynomial.
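The bounds (10) and (11) can be compared with numerically computed zeros of $P_n(\cos\theta)$. The following is a sketch only: it finds the zeros by Newton's method with the usual starting guess $\cos\big(\pi(m-\tfrac14)/(n+\tfrac12)\big)$, which is a standard numerical device and not part of the paper, and all function names are hypothetical.

```python
import math

def legendre_and_derivative(n, x):
    """Return (P_n(x), P_n'(x)) for n >= 1 and |x| < 1 via the three-term recurrence."""
    p_prev, p = 1.0, x                      # P_0(x), P_1(x)
    for k in range(2, n + 1):
        p_prev, p = p, ((2 * k - 1) * x * p - (k - 1) * p_prev) / k
    dp = n * (x * p - p_prev) / (x * x - 1.0)
    return p, dp

def legendre_zero_angles(n):
    """Zeros t_1 < ... < t_n of P_n(cos θ) in (0, π), refined by Newton's method."""
    angles = []
    for m in range(1, n + 1):
        x = math.cos(math.pi * (m - 0.25) / (n + 0.5))   # starting guess
        for _ in range(100):
            p, dp = legendre_and_derivative(n, x)
            step = p / dp
            x -= step
            if abs(step) < 1e-15:
                break
        angles.append(math.acos(x))
    return sorted(angles)

def bounds_hold(n):
    t = legendre_zero_angles(n)
    # (10): one zero in each of the n equal parts of (0, π)
    equal_parts = all((m - 1) * math.pi / n < t[m - 1] < m * math.pi / n
                      for m in range(1, n + 1))
    # (11): strict bounds for the zeros lying strictly below π/2
    sharper = all((m - 0.5) * math.pi / n < t[m - 1] < m * math.pi / (n + 1)
                  for m in range(1, n // 2 + 1))
    return equal_parts and sharper

print(all(bounds_hold(n) for n in range(1, 13)))
```

For odd $n$ the middle zero is $\pi/2$, which is the equality case (12) and is therefore excluded from the strict check of (11).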
2. The conjugate cosine series of the Heine series (6) represents $\tfrac12 Q_n(x) = \tfrac12 Q_n(\cos\theta)$, i.e. half of the $n$th spherical harmonic of the second kind on the cut $-1<x<+1$. Hence

(13)  $\tfrac12 Q_n(\cos\theta) = \sum_{\nu=0}^{\infty} c_\nu\cos(n+2\nu+1)\theta$.

Here the decomposition

(14)  $\tfrac12 Q_n(\cos\theta) = \Big(\sum_{\nu=0}^{\infty} c_\nu\cos 2\nu\theta\Big)\cos(n+1)\theta - \Big(\sum_{\nu=0}^{\infty} c_\nu\sin 2\nu\theta\Big)\sin(n+1)\theta$

shows, because $\sum_{\nu=0}^{\infty} c_\nu\cos 2\nu\theta > 0$ for $0<\theta<\pi$, that

(15)  $\operatorname{sgn} Q_n(\cos\theta_m) = (-1)^m$

holds when $\theta_m = m\pi/(n+1)$, $m = 1, 2, \ldots, n$, is set.

From this it follows: if the interval $(0, \pi)$ is divided into $n+1$ equal parts, then $Q_n(\cos\theta)$ has at least one zero in the interior of each subinterval, hence altogether at least $n+1$ zeros in the interior of the interval $(0, \pi)$.

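3. The Heine series also leads to other properties of the Legendre polynomials. Thus, for example, the so-called Stieltjes estimate also results at one stroke from the form (5):

(16)  $f(\theta) = (\pi/4)\,\alpha_n(2n+1)P_n(\cos\theta) = \sum_{\nu=0}^{\infty}\alpha_\nu\beta_\nu^{(n)}\sin(n+2\nu+1)\theta$,

where

(17)  $\beta_0^{(n)} = 1$,  $\beta_\nu^{(n)} = \dfrac{(2n+2)(2n+4)\cdots(2n+2\nu)}{(2n+3)(2n+5)\cdots(2n+2\nu+1)}$  $(\nu = 1, 2, 3, \ldots)$.

The Stieltjes estimate of the Legendre polynomial was already derived from the Heine series by Hobson [14]. The following derivation may well be simpler.

Since $f(\theta)$ represents the imaginary component, taken for $z = e^{i\theta}$, of the power series

(18)  $F(z) = z^{n+1}\sum_{\nu=0}^{\infty}\alpha_\nu\beta_\nu^{(n)} z^{2\nu} = \sum_{\nu=0}^{\infty}\alpha_\nu\beta_\nu^{(n)} z^{n+2\nu+1}$,

we certainly have

(19)  $|f(\theta)| \le |F(e^{i\theta})| = \Big|\sum_{\nu=0}^{\infty}\alpha_\nu\beta_\nu^{(n)} z^{2\nu}\Big|_{z=e^{i\theta}}$.

Since $\beta_0 = 1, \beta_1, \beta_2, \ldots$ is a simply monotone null sequence, Abel's transformation gives

(20)  $|f(\theta)| \le \Big|\sum_{\nu=0}^{\infty}(\beta_\nu-\beta_{\nu+1})(\alpha_0 + \alpha_1 z^2 + \cdots + \alpha_\nu z^{2\nu})\Big|_{z=e^{i\theta}}$.

Now for the partial sums of the binomial series for $(1-z)^{-1/2}$ there holds [Fejér 6, Szegő 25]

(21)  $|\alpha_0 + \alpha_1 z + \cdots + \alpha_\nu z^\nu| \le \ldots\,|1-z|^{-1/2}$  $(\nu = 0, 1, 2, 3, \ldots;\ |z|\le 1,\ z\ne 1)$,

hence

(22)  $|f(\theta)| \le \ldots\,|1-e^{2i\theta}|^{-1/2} = \ldots\,(\sin\theta)^{-1/2}$,

i.e.

(23)  $\dfrac{\pi}{4}\,\dfrac{1\cdot3\cdots(2n-1)}{2\cdot4\cdots(2n)}\,(2n+1)\,|P_n(\cos\theta)| \le \ldots\,(\sin\theta)^{-1/2}$  $(0<\theta<\pi;\ n = 0, 1, 2, 3, \ldots)$.

This is the Stieltjes estimate for $P_n(\cos\theta)$. We see that it is also valid for the conjugate cosine series of the Heine sine series, i.e. for $\tfrac12 Q_n(\cos\theta)$ in place of $(\pi/4)P_n(\cos\theta)$.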

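4. Since the $\alpha_\nu$, $\nu = 0, 1, 2, \ldots$, are independent of $n$ and form a monotone null sequence; since, furthermore, the sequence $\beta_0^{(n)}, \beta_1^{(n)}, \ldots, \beta_\nu^{(n)}, \ldots$ is also a monotone (null) sequence with $\lim_{n\to\infty}\beta_\nu^{(n)} = 1$ for arbitrary but fixed $\nu$; and since, finally, the series $\sum_{\nu=0}^{\infty}\alpha_\nu z^{2\nu}$ converges uniformly to its sum $(1-e^{2i\theta})^{-1/2}$ on every arc $\varepsilon\le\theta\le\pi-\varepsilon$, $\varepsilon>0$, of the upper unit semicircle, by the theorems of Schlömilch and Abel, we also have

(24)  $\lim_{n\to\infty}\Big(\sum_{\nu=0}^{\infty}\alpha_\nu\beta_\nu^{(n)} e^{2i\nu\theta}\Big) = (1-e^{2i\theta})^{-1/2} = (2\sin\theta)^{-1/2}\cos(\pi/4-\theta/2) + i\cdot(2\sin\theta)^{-1/2}\sin(\pi/4-\theta/2)$,

and this uniformly on every arc $\varepsilon\le\theta\le\pi-\varepsilon$, $\varepsilon>0$. Thus (18) yields

(25)  $(\pi/4)\,\alpha_n(2n+1)P_n(\cos\theta) = (2\sin\theta)^{-1/2}\{\cos[(n+\tfrac12)\theta - \pi/4] + \delta_n(\theta)\}$,

(26)  $\tfrac12\,\alpha_n(2n+1)Q_n(\cos\theta) = (2\sin\theta)^{-1/2}\{\cos[(n+\tfrac12)\theta + \pi/4] + \eta_n(\theta)\}$,

where

$\lim_{n\to\infty}\delta_n(\theta) = \lim_{n\to\infty}\eta_n(\theta) = 0$,

and this uniformly for every subinterval $\varepsilon\le\theta\le\pi-\varepsilon$, $\varepsilon>0$. (25) is Laplace's asymptotic formula for $P_n(\cos\theta)$, (26) is Heine's for $Q_n(\cos\theta)$. [See Heine 13, Vol. 1, p. 175.]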
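Formula (25) can be explored numerically by solving it for $\delta_n(\theta)$. This is a rough sketch under the assumption that $P_n$ is evaluated by the standard three-term recurrence; the arc $0.5\le\theta\le\pi-0.5$ and the two values of $n$ are arbitrary illustrative choices.

```python
import math

def legendre(n, x):
    """P_n(x) by the three-term recurrence."""
    if n == 0:
        return 1.0
    p_prev, p = 1.0, x
    for k in range(2, n + 1):
        p_prev, p = p, ((2 * k - 1) * x * p - (k - 1) * p_prev) / k
    return p

def alpha(n):
    """α_n = (1·3···(2n−1)) / (2·4···(2n)), with α_0 = 1."""
    a = 1.0
    for k in range(1, n + 1):
        a *= (2 * k - 1) / (2 * k)
    return a

def delta(n, theta):
    """δ_n(θ), obtained by solving (25) for the error term."""
    lhs = (math.pi / 4) * alpha(n) * (2 * n + 1) * legendre(n, math.cos(theta))
    return lhs * math.sqrt(2 * math.sin(theta)) - math.cos((n + 0.5) * theta - math.pi / 4)

# Sample points on the arc 0.5 <= θ <= π − 0.5
thetas = [0.5 + j * (math.pi - 1.0) / 50 for j in range(51)]
max_delta_20 = max(abs(delta(20, t)) for t in thetas)
max_delta_200 = max(abs(delta(200, t)) for t in thetas)

print(max_delta_20, max_delta_200)
```

On a fixed closed arc inside $(0, \pi)$ the sampled error shrinks as $n$ grows, which is what the uniform convergence $\delta_n(\theta)\to 0$ asserts.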


VII. ASYMPTOTIC PROPERTIES OF THE REMAINDER OF AN ARBITRARY
FOURIER SINE SERIES. SHARPENING OF THE RESULT
IN THE CASE OF A SIMPLY OR MULTIPLY MONOTONE
COEFFICIENT SEQUENCE


1. Let $f(\theta)$ be an arbitrary continuous function defined in the interval $0\le\theta\le\pi$, whose derivative is likewise continuous there. Let

(1)  $f(\theta) = b_1\sin\theta + b_2\sin 2\theta + \cdots + b_n\sin n\theta + \cdots$

be the Fourier sine series of this function, i.e.

(2)  $b_n = \dfrac{2}{\pi}\int_0^\pi f(t)\sin nt\,dt$.

With the aid of the method that I gave in my paper on the remainder of the Fourier series [Fejér 2, 3, 4], one easily finds for the remainder

(3)  $f(\theta) - s_n(\theta) = \sum_{\nu=n+1}^{\infty} b_\nu\sin\nu\theta$,  $0<\theta<\pi$,

the following asymptotic formula:

(4)  $f(\theta) - s_n(\theta) = \dfrac{1}{(n+\frac12)\pi}\Big\{f(+0)\,\dfrac{\cos(n+\frac12)\theta}{\sin\frac12\theta} + (-1)^n f(\pi-0)\,\dfrac{\sin(n+\frac12)\theta}{\cos\frac12\theta} + \varepsilon_n(\theta)\Big\}$,

where $0<\theta<\pi$ and $\varepsilon_n(\theta)$ converges uniformly to $0$ in the interval $\delta\le\theta\le\pi-\delta$ for arbitrarily small but fixed positive $\delta$.* I now assume that at least one of the two values $f(+0)$, $f(\pi-0)$, say $f(+0)$, is different from $0$.

We now consider the roots of the equation $\sin(n+\tfrac12)\theta = 0$, i.e.

(5)  $\theta_m = \dfrac{2m\pi}{2n+1}$  $(m = 1, 2, 3, \ldots)$,

of these, however, only those that fall in the interval $\delta\le\theta\le\pi-\delta$, where $\delta$ denotes a fixed positive quantity. Since from (4)

(6)  $\big(f(\theta) - s_n(\theta)\big)_{\theta=\theta_m} = \dfrac{1}{(n+\frac12)\pi}\Big\{f(+0)\,\dfrac{\cos m\pi}{\sin(\theta_m/2)} + \varepsilon_n(\theta_m)\Big\}$

follows, one has, because of the uniform convergence of $\varepsilon_n(\theta)$ to $0$, provided only that $n$ is sufficiently large:

(7)  $\operatorname{sgn}\big(f(\theta) - s_n(\theta)\big)_{\theta=\theta_m} = (-1)^m\operatorname{sgn} f(+0)$.

* For $f(\theta) = (\pi-\theta)/2$ the formula reads
$\dfrac{\sin(n+1)\theta}{n+1} + \dfrac{\sin(n+2)\theta}{n+2} + \cdots = \dfrac{1}{2n+1}\Big\{\dfrac{\cos(n+\frac12)\theta}{\sin\frac12\theta} + \varepsilon_n(\theta)\Big\}$,
where $\lim_{n\to\infty}\varepsilon_n(\theta) = 0$, and this uniformly for $\delta\le\theta\le\pi-\delta$, $\delta>0$.

We have thus obtained the following result for the remainder

$f(\theta) - s_n(\theta) = \sum_{\nu=n+1}^{\infty} b_\nu\sin\nu\theta$:

If $\theta$ moves only in the interval $\delta\le\theta\le\pi-\delta$, where $\delta$ denotes an arbitrarily small but fixed positive number, and if the index $n$ is correspondingly large, then the remainder (3) of the Fourier sine series of an arbitrary function $f(\theta)$ satisfying the above conditions has at least one root between every two consecutive roots

$\theta_{m-1} = 2(m-1)\dfrac{\pi}{2n+1}$,  $\theta_m = 2m\,\dfrac{\pi}{2n+1}$

of the equation $\sin(n+\tfrac12)\theta = 0$, provided that both of these roots fall in the interval $\delta\le\theta\le\pi-\delta$.

How can this theorem of Sturmian character be made more precise in the special case where the coefficient sequence of the sine series

(8)  $b_1\sin\theta + b_2\sin 2\theta + \cdots + b_n\sin n\theta + \cdots$

is a simply monotone null sequence? Our assumption is now again somewhat more general, inasmuch as it is not at all necessary for the series (8) to be a Fourier series. The decomposition (4) in §V, No. 1, gives us at once the following theorem:

If the coefficient sequence of the infinite sine series (8) is an arbitrary simply monotone null sequence, then the remainder

(9)  $b_{n+1}\sin(n+1)\theta + b_{n+2}\sin(n+2)\theta + \cdots$

has at least $n$ distinct zeros (points of sign change) $t_1, t_2, \ldots, t_n$ in the interior of the interval $(0, \pi)$, and indeed

(10)  $2(m-1)\dfrac{\pi}{2n+1} < t_m < 2m\,\dfrac{\pi}{2n+1}$  $(m = 1, 2, \ldots, n)$.

This holds for every value of $n$.

In other words: what in the case of an "arbitrary" series holds (a) only in a subinterval of $(0, \pi)$, and (b) only for sufficiently large values of $n$, holds in the case of monotonically decreasing coefficients in the whole interval $(0, \pi)$ and for every $n$.
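The sign alternation at the nodes $2m\pi/(2n+1)$ that underlies (10) can be illustrated for one assumed example. In the sketch below, $b_\nu = 1/\nu$ truncated after 400 terms and $n = 4$ are illustrative choices, and `sine_remainder` is a hypothetical helper name.

```python
import math

def sine_remainder(b, n, theta):
    """Σ_{ν > n} b_ν sin(νθ) for a finite list b = [b_1, b_2, ...]."""
    return sum(b[v - 1] * math.sin(v * theta) for v in range(n + 1, len(b) + 1))

b = [1.0 / v for v in range(1, 401)]      # monotone null sequence, truncated
n = 4

# Nodes θ_m = 2mπ/(2n+1), m = 1, ..., n: the roots of sin((n + 1/2)θ) in (0, π)
nodes = [2 * m * math.pi / (2 * n + 1) for m in range(1, n + 1)]
values = [sine_remainder(b, n, t) for t in nodes]
# The sign at θ_m should be (−1)^m, i.e. positive exactly for even m
alternates = all((val > 0) == (m % 2 == 0) for m, val in enumerate(values, start=1))
# Positive near θ = 0, so the first interval (0, θ_1) also contains a zero
starts_positive = sine_remainder(b, n, 0.001) > 0

print(alternates, starts_positive)
```

A positive value near $0$ followed by alternating signs at the $n$ nodes gives $n$ sign changes, one in each interval of (10).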
VIII. ESTIMATION OF THE REMAINDER AND OF THE PARTIAL SUM OF A
POWER SERIES BY MEANS OF THE FUNCTION REPRESENTED,
WHEN THE COEFFICIENT SEQUENCE IS SIMPLY OR
MULTIPLY MONOTONE


1. Let

(1)  $c_0, c_1, c_2, \ldots, c_n, \ldots$

be an arbitrary triply monotone null sequence. Then the power series

(2)  $f(z) = \sum_{\nu=0}^{\infty} c_\nu z^\nu$

is convergent for $|z|\le 1$, $z\ne 1$, and for the remainder sum $R_n(z) = \sum_{\nu=n}^{\infty} c_\nu z^\nu$ we have

(3)  $|R_n(z)| \le |f(z)|$,  for $|z|\le 1$, $z\ne 1$, $n = 0, 1, 2, \ldots$.

For the absolute value of the partial sums $s_n(z)$ we therefore have

(4)  $|s_n(z)| = |c_0 + c_1 z + \cdots + c_n z^n| \le 2|f(z)|$,

valid for $|z|\le 1$, $z\ne 1$, $n = 0, 1, 2, \ldots$.

Proof. Since

(5)  $R_n(z) = z^n\sum_{\nu=0}^{\infty} c_{n+\nu} z^\nu$,

we have for $z = re^{i\theta}$, $0<r<1$,

(6)  $R_n(re^{i\theta}) = e^{in\theta}\sum_{\nu=0}^{\infty} c_{n+\nu} r^{n+\nu} e^{i\nu\theta} = e^{in\theta}\sum_{\nu=0}^{\infty} c_{n+\nu} r^{n+\nu}(\cos\nu\theta + i\sin\nu\theta)$.

Since $c_{n+\nu} r^{n+\nu}$, for fixed $r$, converges to $0$ sufficiently strongly as $\nu\to\infty$, Abel's transformation of second order is permitted:

(7)  $R_n(re^{i\theta}) = e^{in\theta}\sum_{\nu=0}^{\infty}\Delta^2(c_{n+\nu} r^{n+\nu})\big(s_\nu^{(1)}(\theta) + i\sigma_\nu^{(1)}(\theta)\big)$,

where $s_\nu^{(1)}(\theta)$ denote the partial sums of first order of the series

(8)  $1 + \cos\theta + \cos 2\theta + \cdots + \cos n\theta + \cdots$,

and $\sigma_\nu^{(1)}(\theta)$ the partial sums of first order of the series

(9)  $0 + \sin\theta + \sin 2\theta + \cdots + \sin n\theta + \cdots$.

[Thus

$s_\nu^{(1)}(\theta) = (\nu+1) + \nu\cos\theta + \cdots + 1\cdot\cos\nu\theta$;

$\sigma_\nu^{(1)}(\theta) = \nu\sin\theta + (\nu-1)\sin 2\theta + \cdots + 1\cdot\sin\nu\theta$.]

Since, furthermore, the sequence $\{c_\nu\}$ was assumed to be triply monotone, it is in any case also doubly monotone, and the sequence $\{c_\nu r^\nu\}$ is, because $r<1$, likewise doubly monotone. The same is true of the remainder sequence of the latter sequence, i.e. of

(10)  $c_n r^n,\ c_{n+1} r^{n+1},\ c_{n+2} r^{n+2},\ \ldots$.

Further, $s_\nu^{(1)}(\theta)$ and $\sigma_\nu^{(1)}(\theta)$ are nonnegative for $0\le\theta\le\pi$ and $\nu = 0, 1, 2, 3, \ldots$. I have thus shown that in the infinite sum on the right-hand side of equation (7) [which has the value $e^{-in\theta}R_n(re^{i\theta})$] every term is a complex number with nonnegative real and imaginary components, $0\le\theta\le\pi$.

But the sequence $\{c_\nu\}$, and hence also $\{c_\nu r^\nu\}$, is even triply monotone, so that the second differences

(11)  $\Delta^2(c_0 r^0),\ \Delta^2(c_1 r^1),\ \Delta^2(c_2 r^2),\ \Delta^2(c_3 r^3),\ \ldots$

are not only nonnegative but also monotonically decreasing. If, therefore, in the sum for $e^{-in\theta}R_n(re^{i\theta})$ I replace the difference $\Delta^2(c_{n+\nu} r^{n+\nu})$, for every $\nu$, by the difference $\Delta^2(c_\nu r^\nu)$, then the absolute value of the infinite sum is not diminished, i.e.

(12)  $|R_n(re^{i\theta})| \le \Big|\sum_{\nu=0}^{\infty}\Delta^2(c_\nu r^\nu)\big(s_\nu^{(1)}(\theta) + i\sigma_\nu^{(1)}(\theta)\big)\Big|$.

But what now stands on the right inside the absolute value sign is $f(re^{i\theta})$ itself [in Abel's transformation of second order of its sum form (2)]. Hence we have

(13)  $|R_n(re^{i\theta})| \le |f(re^{i\theta})|$.

This has been derived, to begin with, for $0\le\theta\le\pi$; because the $c_\nu$ are real, however, it is valid for every real $\theta$. Here $r<1$ was assumed throughout. Since the series are also convergent everywhere for $z = e^{i\theta}$ (with the possible exception of $z = 1$, i.e. $\theta = 0$), I have finally proved (3), from which (4) follows, as already remarked, because $s_n(z) = f(z) - R_{n+1}(z)$. [See: M. Riesz 23, S. Chapman 1, L. Fejér 6, G. Szegő 25. Compare Szegő's general criterion with mine.]
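Estimates (3) and (4) can be tried out on a concrete completely monotone (hence triply monotone) null sequence. The sketch below assumes the illustrative choice $c_\nu = 1/(\nu+1)$, for which $f(z) = -\log(1-z)/z$ inside the unit circle; the radii, angles and range of $n$ are arbitrary sampling choices.

```python
import cmath
import math

def f(z):
    """f(z) = Σ z^ν/(ν+1) = −log(1 − z)/z for 0 < |z| < 1."""
    return -cmath.log(1 - z) / z

def partial_sum(n, z):
    """s_n(z) = c_0 + c_1 z + ... + c_n z^n with c_ν = 1/(ν+1)."""
    return sum(z ** v / (v + 1) for v in range(n + 1))

worst_remainder_ratio = 0.0
worst_partial_ratio = 0.0
for r in (0.5, 0.9, 0.99):
    for j in range(1, 200):
        z = cmath.rect(r, j * math.pi / 100)      # θ in (0, 2π), avoiding θ = 0
        fz = f(z)
        for n in range(0, 21):
            s_n = partial_sum(n, z)
            remainder = fz - s_n                  # this is R_{n+1}(z)
            worst_remainder_ratio = max(worst_remainder_ratio, abs(remainder) / abs(fz))
            worst_partial_ratio = max(worst_partial_ratio, abs(s_n) / abs(fz))

print(worst_remainder_ratio, worst_partial_ratio)
```

The first ratio staying at most $1$ corresponds to (3), and the second staying at most $2$ corresponds to (4), at the sampled points only.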


IX. ON THE UNIVALENCE OF POWER SERIES WITH MONOTONE
COEFFICIENT SEQUENCES OF VARIOUS ORDERS


1. If in the power series

(1)  $f(z) = c_1 z + c_2 z^2 + \cdots + c_n z^n + \cdots$

the coefficient sequence $\{c_\nu\}$ is quadruply monotone, then the series maps the interior of the unit circle $|z|<1$ univalently onto the function plane.

Proof. For $r<1$ we have

(2)  $f(re^{i\theta}) = u(r, \theta) + iv(r, \theta) = \Big(\sum_{\nu=1}^{\infty} c_\nu r^\nu\cos\nu\theta\Big) + i\Big(\sum_{\nu=1}^{\infty} c_\nu r^\nu\sin\nu\theta\Big)$.

Since $\{c_\nu r^\nu\}$, $\nu = 1, 2, 3, \ldots$, is certainly doubly monotone, we have, by the theorem in §III, No. 3, $v(r, \theta)>0$ for $0<\theta<\pi$. Since $\{c_\nu r^\nu\}$, $\nu = 1, 2, 3, \ldots$, is moreover quadruply monotone, $u(r, \theta)$ is monotonically decreasing for $0\le\theta\le\pi$ (§III, No. 2). If, therefore, the point $re^{i\theta}$, for fixed $r<1$, describes the upper semicircle $re^{i\theta}$, $0\le\theta\le\pi$, starting at $\theta = 0$ and ending at $\theta = \pi$, then the image point $u(r, \theta) + iv(r, \theta)$ describes a curve of the $w$-plane that lies in the upper half-plane and can have no double point. In view of the fact that the image of the circle, $f(re^{i\theta})$, $0\le\theta\le 2\pi$, is symmetric with respect to the $u$-axis, I have thus proved that to every circle $|z| = r$, $r<1$, there corresponds a Jordan curve in the function plane, from which our theorem follows.

Further, the following theorem holds.

If in the power series

(3)  $f(z) = c_1 z + c_2 z^3 + c_3 z^5 + \cdots + c_n z^{2n-1} + \cdots$

the coefficient sequence $\{c_\nu\}$ is triply monotone, then the series is convergent and univalent for $|z|<1$.

2. The functions $f(z)$ such as (1) and (3) form a subclass of those functions univalent in the unit circle that I considered in my notes [L. Fejér 9, 10], and that can be characterized in general by the fact that for them $zf'(z)$ has an imaginary component of constant sign in the interior of the upper half of the unit circle ($z = re^{i\theta}$, $0<r<1$, $0<\theta<\pi$).* On the basis of the results of my notes I can therefore add that for the series (1) the arithmetic means of third order of the partial sums, and for the series (3) the arithmetic means of second order of the partial sums, are univalent for $|z|<1$ and for $n = 0, 1, 2, 3, \ldots$.
DIFFERENTIAL GEOMETRY OF A CERTAIN
TYPE OF SURFACE IN $S_4$*


BY 
V. G. GROVE 


1. INTRODUCTION 


In this paper we shall study the sustaining surface of an orthogonal conjugate
net immersed in a space of four dimensions. We set up a defining system
of partial differential equations. Associated with each point of the surface
is a unique plane containing all of the normals to the surface. We define cer- 
tain unique normals and pairs of normals to the surface and characterize 
them geometrically. For this purpose we study the sustaining surfaces of the 
orthogonal projections of the given net on certain geometrically defined 
spaces of three dimensions. We call these surfaces normal projection surfaces. 
A normal determines a unique normal projection surface. Among the nor- 
mal projection surfaces there are two, one possessing maximum total curva- 
ture, the other minimum total curvature. The normals determining these par- 
ticular projection surfaces are perpendicular. We have called them the prin- 
cipal normals. An analogue is given of the well known theorem that if a line 
of curvature is a geodesic it is a plane curve. 

Let the curves of the given orthogonal conjugate net NV, be taken as the 
parametric curves. The non-homogeneous cartesian coordinates %2, x3, 
of the point « on the given surface S, will therefore satisfy the equations 


(1) Xun = bx, xx, =0. 

We shall call the plane containing all of the normals to S, at x the normal 
plane to S, at x. Select in the normal plane two perpendicular lines \ and yu 
with direction cosines (Ai, Az, As, A4) and (441, Me, Ms, Ms) Tespectively. It follows 
that the functions \ and p satisfy the equations 
Ax, = 0, Ax, = 0, ux, = 0, ux, = 0, 

> a? = 1, > = 1, > = 0. 
We see readily that the functions x, \, wu satisfy a system of differential equa- 
tions of the form 
(3) ’ = ax, + Bx, + DA + Do, 
= ax, + bx», 
= + + Di'x + D?' 


(2) 


* Presented to the Society, September 13, 1935; received by the editors April 15, 1935. 
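     $\lambda_u = m_1 x_u + A_1 \mu, \qquad \mu_u = m_2 x_u + A_2 \lambda,$
     $\lambda_v = n_1 x_v + B_1 \mu, \qquad \mu_v = n_2 x_v + B_2 \lambda,$
wherein
     $\alpha = \tfrac12 E_u/E, \quad \beta = -\tfrac12 E_v/G, \quad D_1 = \sum \lambda x_{uu}, \quad D_2 = \sum \mu x_{uu},$
     $a = \tfrac12 E_v/E, \quad b = \tfrac12 G_u/G,$
     $\gamma = -\tfrac12 G_u/E, \quad \delta = \tfrac12 G_v/G, \quad D_1'' = \sum \lambda x_{vv}, \quad D_2'' = \sum \mu x_{vv},$
     $m_1 = -D_1/E, \quad n_1 = -D_1''/G, \quad m_2 = -D_2/E, \quad n_2 = -D_2''/G,$
and wherein
     $E = \sum x_u^2, \qquad G = \sum x_v^2.$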
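The integrability conditions of system (3) are

(5)  $\tfrac12 E_v \left( \dfrac{D_1}{E} + \dfrac{D_1''}{G} \right) = D_{1v} + B_2 D_2, \qquad \tfrac12 E_v \left( \dfrac{D_2}{E} + \dfrac{D_2''}{G} \right) = D_{2v} + B_1 D_1,$
     $\tfrac12 G_u \left( \dfrac{D_1}{E} + \dfrac{D_1''}{G} \right) = D''_{1u} + A_2 D_2'', \qquad \tfrac12 G_u \left( \dfrac{D_2}{E} + \dfrac{D_2''}{G} \right) = D''_{2u} + A_1 D_1'',$

and

(6)  $G_u(E_u/E + G_u/G) + E_v(E_v/E + G_v/G) = 2(E_{vv} + G_{uu}) + 4(D_1 D_1'' + D_2 D_2''),$

(7)  $A_{1v} = B_{1u}, \qquad A_1 + A_2 = 0, \qquad B_1 + B_2 = 0.$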


2. POWER SERIES EXPANSIONS FOR THE SURFACE 
If we use the tangent lines to $C_u$ and to $C_v$ and the lines $\lambda$ and $\mu$ for the axes
of a local system of reference, we find that the coordinates of a point $y$ with
general coordinates $(y_1, y_2, y_3, y_4)$ will have local coordinates $(\xi_1, \xi_2, \xi_3, \xi_4)$
defined by the expression

(8)  $y = x + \xi_1 \dfrac{x_u}{E^{1/2}} + \xi_2 \dfrac{x_v}{G^{1/2}} + \xi_3 \lambda + \xi_4 \mu.$

Let $y$ be a point on $S_x$ with curvilinear coordinates $(u + \Delta u, v + \Delta v)$, where
$(u, v)$ are the curvilinear coordinates of $x$. The coordinates of $y$ are of the form

(9)  $y = x + x_u \Delta u + x_v \Delta v + \tfrac12 (x_{uu} \Delta u^2 + 2 x_{uv} \Delta u \Delta v + x_{vv} \Delta v^2) + \cdots.$


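If use be made of (3), (8), and (9) we find that the local coordinates of $y$
are defined by the expressions

(10)  $\xi_1 = E^{1/2} [\Delta u + \tfrac12 (\alpha \Delta u^2 + 2a \Delta u \Delta v + \gamma \Delta v^2) + \cdots],$
      $\xi_2 = G^{1/2} [\Delta v + \tfrac12 (\beta \Delta u^2 + 2b \Delta u \Delta v + \delta \Delta v^2) + \cdots],$
      $\xi_3 = \tfrac12 (D_1 \Delta u^2 + D_1'' \Delta v^2) + \tfrac16 [(\alpha D_1 + D_{1u} + A_2 D_2) \Delta u^3 + 3a D_1 \Delta u^2 \Delta v$
              $+ 3b D_1'' \Delta u \Delta v^2 + (\delta D_1'' + D''_{1v} + B_2 D_2'') \Delta v^3] + \cdots,$
      $\xi_4 = \tfrac12 (D_2 \Delta u^2 + D_2'' \Delta v^2) + \tfrac16 [(\alpha D_2 + D_{2u} + A_1 D_1) \Delta u^3 + 3a D_2 \Delta u^2 \Delta v$
              $+ 3b D_2'' \Delta u \Delta v^2 + (\delta D_2'' + D''_{2v} + B_1 D_1'') \Delta v^3] + \cdots.$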
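From (10) we find the following expansions in local coordinates:

(11)  $\xi_3 = \dfrac12 \dfrac{D_1}{E} \xi_1^2 + \dfrac12 \dfrac{D_1''}{G} \xi_2^2 + \dfrac16 \dfrac{D_{1u} + A_2 D_2 - 2\alpha D_1}{E^{3/2}} \xi_1^3 - \dfrac12 \dfrac{a D_1 + \beta D_1''}{E G^{1/2}} \xi_1^2 \xi_2$
              $- \dfrac12 \dfrac{b D_1'' + \gamma D_1}{G E^{1/2}} \xi_1 \xi_2^2 + \dfrac16 \dfrac{D''_{1v} + B_2 D_2'' - 2\delta D_1''}{G^{3/2}} \xi_2^3 + \cdots,$
      $\xi_4 = \dfrac12 \dfrac{D_2}{E} \xi_1^2 + \dfrac12 \dfrac{D_2''}{G} \xi_2^2 + \dfrac16 \dfrac{D_{2u} + A_1 D_1 - 2\alpha D_2}{E^{3/2}} \xi_1^3 - \dfrac12 \dfrac{a D_2 + \beta D_2''}{E G^{1/2}} \xi_1^2 \xi_2$
              $- \dfrac12 \dfrac{b D_2'' + \gamma D_2}{G E^{1/2}} \xi_1 \xi_2^2 + \dfrac16 \dfrac{D''_{2v} + B_1 D_1'' - 2\delta D_2''}{G^{3/2}} \xi_2^3 + \cdots.$
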

Equations (11) may be interpreted as follows. The first of equations (11)
and $\xi_4 = 0$ are the equations of the sustaining surface of the orthogonal
projection of the given net on to the $S_3$ determined by the tangent plane and
the normal $\lambda$. A similar statement holds for the second of (11) and $\xi_3 = 0$. We
shall call these surfaces the normal projection surfaces of $S_x$ determined by
$\lambda$ and by $\mu$ respectively, and shall denote them by $S_\lambda$ and $S_\mu$ respectively.

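From (11) we find that the principal radii of normal curvature of $S_\lambda$ are

(12)  $E/D_1, \qquad G/D_1'',$

and the principal radii of normal curvature of $S_\mu$ are

(13)  $E/D_2, \qquad G/D_2''.$

The total curvatures of $S_\lambda$ and $S_\mu$ are respectively

(14)  $K_1 = \dfrac{D_1 D_1''}{EG}, \qquad K_2 = \dfrac{D_2 D_2''}{EG},$

and the mean curvatures are respectively

(15)  $M_1 = D_1/E + D_1''/G, \qquad M_2 = D_2/E + D_2''/G.$
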
From (6) we observe that the sum of the total curvatures of the normal projec- 
tion surfaces determined by perpendicular normals is a constant at a point of the 
surface. 
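
The constancy of this sum can be checked numerically. The sketch below is illustrative only: it assumes the readings $K_1 = D_1 D_1''/(EG)$, $K_2 = D_2 D_2''/(EG)$ of (14), and it assumes that a rotation of the pair of normals through an angle $\theta$ acts on $(D_1, D_2)$ and on $(D_1'', D_2'')$ by the rotation formulas of §3. The function names and the sample values are hypothetical.

```python
import math

def rotate_pair(d1, d2, theta):
    """Rotate a coefficient pair: (A*d1 - B*d2, B*d1 + A*d2), A = cos, B = sin."""
    A, B = math.cos(theta), math.sin(theta)
    return A * d1 - B * d2, B * d1 + A * d2

def total_curvatures(D1, D2, D1pp, D2pp, E, G):
    """Total curvatures of the two normal projection surfaces (assumed form of (14))."""
    return D1 * D1pp / (E * G), D2 * D2pp / (E * G)

def rotated_total_curvatures(D1, D2, D1pp, D2pp, E, G, theta):
    """Total curvatures after turning the perpendicular normals through theta."""
    r1, r2 = rotate_pair(D1, D2, theta)
    r1pp, r2pp = rotate_pair(D1pp, D2pp, theta)
    return total_curvatures(r1, r2, r1pp, r2pp, E, G)

# Arbitrary sample values; the individual curvatures change, their sum does not.
K1, K2 = total_curvatures(1.5, -0.7, 2.0, 0.4, 3.0, 2.0)
R1, R2 = rotated_total_curvatures(1.5, -0.7, 2.0, 0.4, 3.0, 2.0, 0.83)
print(K1 + K2, R1 + R2)
```

The sum is the scalar product of $(D_1, D_2)$ and $(D_1'', D_2'')$ divided by $EG$, which a common rotation of the two pairs leaves unchanged.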

3. A CANONICAL FORM OF THE DEFINING DIFFERENTIAL EQUATIONS 


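Let us now make the transformation

(16)  $\lambda = A\bar\lambda + B\bar\mu, \qquad \mu = -B\bar\lambda + A\bar\mu, \qquad A^2 + B^2 = 1,$

on system (3). The transformation (16) is equivalent to a rotation of axes in
the normal plane. Let the coefficients of the transformed differential
equations be denoted by $\bar\alpha, \bar\beta, \cdots$. We find that these new coefficients are given
by the following formulas:

(17)  $\bar D_1 = A D_1 - B D_2, \qquad \bar D_1'' = A D_1'' - B D_2'',$
      $\bar D_2 = B D_1 + A D_2, \qquad \bar D_2'' = B D_1'' + A D_2'',$
      $\bar m_1 = A m_1 - B m_2, \qquad \bar m_2 = B m_1 + A m_2,$
      $\bar n_1 = A n_1 - B n_2, \qquad \bar n_2 = B n_1 + A n_2,$
      $\bar A_1 = A_1 + A_u B - A B_u, \qquad \bar B_1 = B_1 + A_v B - A B_v,$
      $\bar A_2 = A_2 + A B_u - A_u B, \qquad \bar B_2 = B_2 + A B_v - A_v B.$
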

The total curvature $\bar K_1$ of the surface of normal projection $S_{\bar\lambda}$ is
determined by the expression

      $EG \bar K_1 = (A D_1 - B D_2)(A D_1'' - B D_2'').$

This surface has maximum or minimum total curvature if and only if $A$ and $B$
satisfy the quadratic equation

(18)  $L A^2 - 2 M A B - L B^2 = 0,$

wherein

(19)  $L = D_1 D_2'' + D_1'' D_2, \qquad M = D_2 D_2'' - D_1 D_1''.$
Since the product of the roots of (18) as a quadratic in A/B is minus
one, the two normals determined by these roots are perpendicular. We shall 
call these normals the principal normals of S, at x, and the normal projection 
surfaces determined by the principal normals, the principal normal projec- 
tion surfaces. We may state our results in the following form: Through the 
point x there exist two normals with the property that the normal projection sur- 
faces determined by them have maximum and minimum total curvatures. 
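
As an illustration of how the principal normals can be computed from (18), the following sketch writes $A = \cos\theta$, $B = \sin\theta$, so that (18) becomes $L\cos 2\theta - M\sin 2\theta = 0$. The function names and the numerical values of $L$ and $M$ are hypothetical; only the quadratic form itself is taken from the text.

```python
import math

def principal_normal_angles(L, M):
    """Two angles theta, with A = cos(theta), B = sin(theta), solving
    L*A**2 - 2*M*A*B - L*B**2 = 0, i.e. L*cos(2*theta) - M*sin(2*theta) = 0."""
    theta = 0.5 * math.atan2(L, M)
    return theta, theta + math.pi / 2

def residual(L, M, theta):
    """Left member of (18) evaluated at A = cos(theta), B = sin(theta)."""
    A, B = math.cos(theta), math.sin(theta)
    return L * A * A - 2 * M * A * B - L * B * B

t1, t2 = principal_normal_angles(3.0, -1.5)
print(residual(3.0, -1.5, t1), residual(3.0, -1.5, t2))
# The two values of A/B multiply to -1, so the two normals are perpendicular.
print((math.cos(t1) / math.sin(t1)) * (math.cos(t2) / math.sin(t2)))
```

The two solutions differ by a right angle, which is the perpendicularity asserted above.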

Let us suppose that the transformation (16) with values of $A$ and $B$
determined by (18) has been effected on the system (3). The resulting
differential equations assume a canonical form in which

(20)  $D_1'' = l D_1, \qquad D_2'' = -l D_2,$

that is, a form for which $L = 0$. For this form the normals $\lambda$ and $\mu$ are the
principal normals.

From (20) we observe that if one of the principal normal surfaces is iso- 
thermic, the other has the same property. 

4. OTHER UNIQUE NORMALS 

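The general coordinates of the principal centers of normal curvature of
the surface of normal projection $S_{\bar\lambda}$ determined by

      $\bar\lambda = A\lambda - B\mu$

are

(21)  $x + \dfrac{E(A\lambda - B\mu)}{A D_1 - B D_2}, \qquad x + \dfrac{G(A\lambda - B\mu)}{A D_1'' - B D_2''}.$

The local coordinates of these points are

(22)  $\xi_1 = 0, \quad \xi_2 = 0, \quad \xi_3 = \dfrac{AE}{A D_1 - B D_2}, \quad \xi_4 = -\dfrac{BE}{A D_1 - B D_2},$

and

(23)  $\xi_1 = 0, \quad \xi_2 = 0, \quad \xi_3 = \dfrac{AG}{A D_1'' - B D_2''}, \quad \xi_4 = -\dfrac{BG}{A D_1'' - B D_2''},$

respectively.

The loci of the centers of principal normal curvature for all normal
projection surfaces are therefore straight lines.

The equations of these lines are

(24)  $\xi_1 = 0, \quad \xi_2 = 0, \quad D_1 \xi_3 + D_2 \xi_4 = E,$
      $\xi_1 = 0, \quad \xi_2 = 0, \quad D_1'' \xi_3 + D_2'' \xi_4 = G.$
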
We shall call these lines the central lines of $S_x$ at $x$. The central lines of $S_x$ at $x$
are orthogonal if and only if


(25)  $D_1 D_1'' + D_2 D_2'' = 0.$


Hence the central lines are perpendicular if and only if the total curvatures of 
the normal projection surfaces in two perpendicular directions differ only in sign. 
Moreover from the integrability condition (6) we observe that if the total 
curvatures of the normal projection surfaces of a surface sustaining an orthogonal 
conjugate net in perpendicular directions differ only in sign, all such normal 
projection surfaces have the same property. 

The tangent plane to $S_x$ at $x$ and the osculating plane to the curve $C_u$ at $x$
determine a space of three dimensions. This space intersects the normal plane
in a line with direction cosines proportional to

(26)  $D_1 \lambda + D_2 \mu.$

A similar statement holds for the curve $C_v$ and the line through $x$ with
direction cosines proportional to

(27)  $D_1'' \lambda + D_2'' \mu.$

We shall call the lines through $x$ and with directions defined by (26) and (27)
the intersector normals of the curves $C_u$ and $C_v$ respectively. We see readily
that the intersector normals are perpendicular if and only if the central lines
are perpendicular.

If we define a geodesic as an extremal curve of the integral

      $\int (E u'^2 + G v'^2)^{1/2} \, dt, \qquad u' = du/dt, \quad v' = dv/dt,$

we find that the differential equation of the geodesics on $S_x$ is

(28)  $u'v'' - u''v' = \gamma v'^3 - (\delta - 2a) u' v'^2 + (\alpha - 2b) u'^2 v' - \beta u'^3.$

Consider now the curve $u = u(t)$, $v = v(t)$ on $S_x$. We may show very readily
that the equations in local coordinates of the osculating plane of the curve are

      $G^{1/2} v' (D_1 u'^2 + D_1'' v'^2) \xi_1 - E^{1/2} u' (D_1 u'^2 + D_1'' v'^2) \xi_2 + (EG)^{1/2} J \xi_3 = 0,$
      $G^{1/2} v' (D_2 u'^2 + D_2'' v'^2) \xi_1 - E^{1/2} u' (D_2 u'^2 + D_2'' v'^2) \xi_2 + (EG)^{1/2} J \xi_4 = 0,$

wherein

      $J = u'v'' - u''v' - \gamma v'^3 + (\delta - 2a) u' v'^2 - (\alpha - 2b) u'^2 v' + \beta u'^3.$

Hence the osculating plane at the point of a curve on $S_x$ intersects the normal
plane to the surface at the point of the curve in a line if and only if the curve is a
geodesic.
It follows that the curve $C_u$ is a geodesic if and only if $E_v = 0$, and $C_v$ is
a geodesic if and only if $G_u = 0$. Moreover from (6) we see that if the curves
of the given net are geodesics the central lines (and intersector normals) are
perpendicular.

Suppose that in the system (3) the following condition is satisfied:

(29)  $\Delta = D_1 D_2'' - D_2 D_1'' = 0.$

If we make the transformation

      $\bar\lambda = \dfrac{D_1 \lambda + D_2 \mu}{(D_1^2 + D_2^2)^{1/2}}, \qquad \bar\mu = \dfrac{D_2 \lambda - D_1 \mu}{(D_1^2 + D_2^2)^{1/2}}$



on system (3), we find that the system assumes the form 
= ax, + Bx, + (D? + D?)'?X, 
+ dbx», 
= + 6x, + 
(D? D?)+/2 
(D? D?)#!? 
E 


Xu + Ay+ 


D? + D? 


(Di'?+ Dz D:D» — 


G D? + D? 


But if use be made of (29) and the integrability conditions (5), we may readily
show that the coefficients of $\bar\mu$ in (30) vanish. It follows therefore that a
necessary and sufficient condition that $S_x$ be immersed in a space of three
dimensions is that

      $\Delta = D_1 D_2'' - D_2 D_1'' = 0.$

Or we may say that a necessary and sufficient condition that the surface $S_x$
be immersed in a space of three dimensions is that the intersector normals
coincide or that the central lines be parallel.

From (17) and (15) we note that the surface of normal projection $S_{\bar\lambda}$
defined by (16) is a minimal surface if and only if

(31)  $\bar M_1 = \bar D_1/E + \bar D_1''/G = A M_1 - B M_2 = 0.$

It follows therefore that if two surfaces of normal projection taken in
perpendicular directions are both minimal surfaces, all surfaces of normal projection are
minimal, and moreover the surface is a minimal surface immersed in a space of
three dimensions. If not both of $M_1$ and $M_2$ are zero, there exists just one normal
to the surface at $x$ which determines a minimal surface of normal projection.
The central lines are parallel if and only if (29) holds. Hence if the sur-
face is not immersed in a space of three dimensions the central lines intersect 
in a point. This point and the point x determine a unique normal which de- 
termines a surface of normal projection which has the point x as an umbilical 
point. 

Again from (17) we see that there exist exactly two surfaces of normal pro- 
jection which are developables. The normals determining these developable sur- 
faces have direction cosines proportional to 


(32)  $D_2 \lambda \pm D_1 \mu,$


wherein $D_1$ and $D_2$ are the coefficients of the canonical system defined by (20).
We shall call these normals the developable normals. The developable normals 
are each perpendicular to one or the other of the intersector normals. The pair 
of developable normals and the pair of intersector normals each make equal angles 
with the principal normals, and hence are paired in involution with the principal 
normals as double lines. 


5. CONGRUENCES CONJUGATE TO THE GIVEN NET 


A congruence is said to be conjugate to a given conjugate net if the de- 
velopables of the congruence intersect the sustaining surface of the net in the 
curves of the net. It is well known that if a line g generates a congruence G 


conjugate to a net, any point $y$ on $g$ generates a surface $S_y$ such that the
tangents to the curves on $S_y$ corresponding to the curves of the given net and
the tangents to the curves of the given net are coplanar.

Let us find the condition that the two-parameter family of lines generated
by $\lambda$ generate a congruence conjugate to $N_x$. Let $y$ be any point on $\lambda$. The
coordinates of $y$ are defined by an expression of the form

      $y = x + \lambda d, \qquad d \neq 0.$

We find readily that

(33)  $y_u = (1 + m_1 d) x_u + d_u (y - x)/d + A_1 \mu d,$
      $y_v = (1 + n_1 d) x_v + d_v (y - x)/d + B_1 \mu d.$

Hence the lines $\lambda$ form a congruence $G$ conjugate to the given net if and
only if

(34)  $A_1 = 0, \qquad B_1 = 0.$

But from (7), we see if (34) is satisfied that $A_2 = 0$, $B_2 = 0$. We see therefore
that if a given normal generates a congruence conjugate to the given orthogonal
conjugate net, the unique normal perpendicular to the given normal also generates
a congruence conjugate to the net.
The normals $\bar\lambda$ defined by (16) will generate a congruence conjugate to $N_x$
if and only if

(35)  $\bar A_1 = A_1 + A_u B - A B_u = 0, \qquad \bar B_1 = B_1 + A_v B - A B_v = 0.$

If we let

      $A = \cos\theta, \qquad B = \sin\theta,$

we find that (35) may be written in the form

(36)  $A_1 - \theta_u = 0, \qquad B_1 - \theta_v = 0.$

The integrability condition of system (36) is

(7 bis)  $A_{1v} - B_{1u} = 0.$

Hence (36) may be solved by a quadrature. Suppose the transformation (16)
with $A$ and $B$ determined by (36) has been effected on system (3). The
resulting system will be of the same form as (3) with

(37)  $A_1 = 0, \quad B_1 = 0, \quad A_2 = 0, \quad B_2 = 0.$

The differential equations (3) characterized by (37) are left unchanged in
form by the transformation (16) with constant $A$ and $B$. We may state our
results in the following form:

All congruences $G$ conjugate* to the given orthogonal conjugate net may be
found by quadratures. If each of the lines of a congruence conjugate to the given
net is rotated through the same constant angle in the normal plane at each point
$x$ of $S_x$, the resulting two-parameter family of lines forms a congruence conjugate
to the net.

Let us suppose that $S_x$ is not immersed in a space of three dimensions.
From (3) we find that

(38)  $\Delta\lambda = D_2'' x_{uu} - D_2 x_{vv} + (\gamma D_2 - \alpha D_2'') x_u + (\delta D_2 - \beta D_2'') x_v,$
      $\Delta\mu = -D_1'' x_{uu} + D_1 x_{vv} + (\alpha D_1'' - \gamma D_1) x_u + (\beta D_1'' - \delta D_1) x_v.$

If we differentiate the first and last of (3) with respect to $u$ and $v$ respectively,
and use (38) and (3), we find that the four functions $x$ satisfy a system of
third-order differential equations of the form

(39)  $x_{uuu} = p_1 x_{uu} + q_1 x_{vv} + p_1' x_u + q_1' x_v,$
      $x_{vvv} = p_2 x_{uu} + q_2 x_{vv} + p_2' x_u + q_2' x_v,$

wherein $q_1$ and $p_2$ are defined by the expressions


* L. P. Eisenhart, Transformations of Surfaces, Princeton University Press, p. 168. 


(40)  $\Delta q_1 = D_1 D_{2u} - D_2 D_{1u} + A_1 (D_1^2 + D_2^2),$
      $\Delta p_2 = D_2'' D''_{1v} - D_1'' D''_{2v} + B_2 (D_1''^2 + D_2''^2).$
Two of the integrability conditions for the system composed of (39) and 
the second of (3) are 


(41) + + qi = Pipe + pou + = dpe. 


It follows therefore that the curve $C_u$ is a plane curve if and only if $q_1 = 0$,
$\Delta \neq 0$, and the curve $C_v$ is a plane curve if and only if $p_2 = 0$, $\Delta \neq 0$.
Suppose both $C_u$ and $C_v$ are plane curves. We may write the conditions
$q_1 = 0$ and $p_2 = 0$ in the following form:

(42)  $\dfrac{\partial}{\partial u} \arctan \dfrac{D_2}{D_1} = -A_1, \qquad \dfrac{\partial}{\partial v} \arctan \dfrac{D_2''}{D_1''} = -B_1.$

It follows from (7) that

(43)  $\dfrac{\partial^2}{\partial u \, \partial v} \left[ \arctan \dfrac{D_2}{D_1} - \arctan \dfrac{D_2''}{D_1''} \right] = 0.$

Let the angle between the intersector normals be denoted by $I$. We may
readily verify that equation (43) may be written in the form

(44)  $I_{uv} = 0.$

Suppose the congruence of normals $\lambda$ is conjugate to the given net, and
let $C_u$ be a plane curve. We may write the first of (42) in the form

      $I_{1u} = 0,$

where $I_1$ is the angle between the intersector normal and the normal $\lambda$. Hence
the angle between the intersector normal of a plane curve of an orthogonal
conjugate net and any line generating a congruence conjugate to the net is a constant
for points on $S_x$ along the plane curve.

The intersector normal corresponding to the curve $C_u$ has direction cosines
defined by expressions of the form

      $\dfrac{D_1 \lambda + D_2 \mu}{(D_1^2 + D_2^2)^{1/2}}.$


We find readily that 
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(D? + D?)*'!?x, qi(Dor Dy) 
E  A(D? + 
(D,D{’ + 
G 
— + + — Du) 
(D? + D#?)8/2 


This intersector normal is therefore conjugate to the given net if and only if

(46)  $q_1 = 0, \qquad D_2 D_{1v} - D_1 D_{2v} + B_2 (D_1^2 + D_2^2) = 0.$


But if use be made of the integrability conditions (5) the second of (46) may 
be written in the form 


The intersector normal of a curve of an orthogonal conjugate net on a surface
not immersed in a space of three dimensions generates a congruence conjugate
to the net if and only if the given curve is a plane curve and a geodesic on $S_x$.
If moreover the central lines are orthogonal the intersector normals of one curve
of the net are parallel as $x$ moves along the other curve of the net.
COVARIANTS OF r-PARAMETER GROUPS*


BY 
C. C. MacDUFFEE 


1. Introduction. The theory of covariants of r-parameter groups has not 
had the development that the theory of invariants has had. Apparently this 
is because the methods used have been almost exclusively those of Lie, in 
which the concept of invariant is central. In projective geometry the covari- 
ant is surely of as great importance as the invariant, and in differential ge- 
ometry the differential covariant or tensor is basic. It would appear desirable, 
then, to attempt some new approach to this subject, an approach having the 
covariant rather than the invariant as the fundamental concept. 

The present purely algebraic treatment is based upon a somewhat novel 
concept of what a covariant is. The approach seems to be justified by the 
fact that the fundamental theorem of covariant theory (Theorem 6), namely 
that every invariantive property can be characterized by the vanishing of 
covariants, follows immediately from the definition of covariant. Perhaps 
there is some justification in claiming that the approach is durchsichtig. 

The ordinary projective invariant theory arises upon specializing the
r-parameter group to the linear homogeneous group. That every projective
property can be characterized by the vanishing of absolute covariants in $n$
cogredient sets of variables follows at once. The reason that exactly $n$ sets
are required is apparent.

The application of the theory to differential forms indicates that the ten- 
sor analysis is more restrictive than is necessary. There are covariants which 
do not obey the tensor law, and yet seem to be of as much use and importance 
as tensors. While some writers have employed such covariants, their use has 
not become general. 

Many of the concepts in this paper were first applied in the projective 

theory by J. Deruyts and A. Capelli, whose works are cited.

2. The parameter groups. Let

      $G_r$:  $a_i' = f_i(a_1, a_2, \cdots, a_n; \xi_1, \xi_2, \cdots, \xi_r),$

or more simply

(2.1)  $G_r$:  $a' = f(a; \xi),$

be an r-parameter group of transformations where $r$ and $n$ are finite or
denumerably infinite. It is understood that the variables $a = (a_1, a_2, \cdots, a_n)$
and the parameters $\xi = (\xi_1, \xi_2, \cdots, \xi_r)$ range over a field $\mathfrak{F}$, and that the
functional values $f_i$ are also in $\mathfrak{F}$. We shall denote by $\xi_0$ values of the
parameters which give the identity transformation, and by $\xi_{-1}$ values which give a
transformation inverse to that with parameters $\xi$. We shall ordinarily
assume that the parameters are essential.

* Presented to the Society, April 20, 1935; received by the editors August 4, 1934.

Consider the two transformations

(2.2)  $T$: $a'' = f(a'; \eta), \qquad T'$: $a'' = f(a; \zeta),$

such that $T'$ is the resultant of $T$ and $G_r$. Since these transformations belong
to a group, there exists a functional relation

(2.3)  $\zeta = g(\eta; \xi)$

among the parameters. We may look upon the $\xi$ as numbers associated with
the variables $a'$, namely the parameters of a transformation which represents
the $a'$ in terms of the $a$. The $\zeta$ are similarly associated with the $a''$. Under
transformation $T$, the $\xi$ are subjected to the induced transformation (2.3).
Hence all transformations (2.3) constitute a group on the $\xi$ with parameters $\eta$.

Let $x_1, x_2, \cdots, x_r$ be new independent variables, and define $x_1', x_2', \cdots, x_r'$
by the equations

(2.4)  $P_r$:  $x' = g(\eta; x).$

We shall call $P_r$ the first parameter-group of $G_r$.*

Equations (2.1) and (2.2) can be looked upon from another angle. We may
consider the $\eta$ as numbers associated with the variables $a'$, namely the
parameters of a transformation which represents the $a''$ in terms of the $a'$. The
$\zeta$ are similarly associated with the $a$. Under (2.1) the $\eta$ are subjected to the
induced transformation (2.3). Hence (2.3) constitute a group on the $\eta$ with
parameters $\xi$.

Let $u_1, u_2, \cdots, u_r$ be new independent variables, and define $u_1', u_2', \cdots, u_r'$
by the equations

(2.5)  $P_r'$:  $u' = g(u; \xi).$

We shall call $P_r'$ the second parameter-group of $G_r$.†

If in particular $\xi = \eta_{-1}$, we shall say that the $u$ are contragredient to the $x$.
That is, if $x' = g(\eta; x)$, then

(2.6)  $u' = g(u; \eta_{-1}), \quad$ or $\quad u = g(u'; \eta).$
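
To make the two parameter-groups concrete, here is a small sketch for a hypothetical two-parameter group on a single variable, $a' = \xi_1 a + \xi_2$ with $\xi_1 \neq 0$. This example group is not taken from the paper; it is chosen only because its composition law $g$ is easy to write down. The function names are illustrative.

```python
from fractions import Fraction as Fr

def f(a, xi):
    """Hypothetical example group G_2: a' = xi1*a + xi2, with xi1 != 0."""
    return xi[0] * a + xi[1]

def g(eta, xi):
    """Composition law (2.3): zeta = g(eta; xi), so f(f(a, xi), eta) == f(a, zeta)."""
    return (eta[0] * xi[0], eta[0] * xi[1] + eta[1])

def first_parameter_group(eta, x):
    """(2.4) P_r: x' = g(eta; x); the new variables take the place of xi."""
    return g(eta, x)

def second_parameter_group(u, xi):
    """(2.5) P_r': u' = g(u; xi); the new variables take the place of eta."""
    return g(u, xi)

def inverse(xi):
    """Parameters xi_{-1} of the inverse transformation a = (a' - xi2)/xi1."""
    return (1 / xi[0], -xi[1] / xi[0])

xi, eta, a = (Fr(2), Fr(3)), (Fr(5), Fr(-1)), Fr(7)
print(f(f(a, xi), eta) == f(a, g(eta, xi)))      # composition agrees with (2.3)

# Contragredient variables (2.6): u' = g(u; eta_{-1}) is undone by u = g(u'; eta).
u = (Fr(4), Fr(9))
u_prime = second_parameter_group(u, inverse(eta))
print(second_parameter_group(u_prime, eta) == u)
```

Exact rational arithmetic is used so that the identities hold without rounding.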


* Lie-Engel, Theorie der Transformationsgruppen, Abs. 1, Teubner, 1888, p. 401.
† G. Kowalewski, Einführung in die Theorie der kontinuierlichen Gruppen, Leipzig, 1931, p. 131;
L. P. Eisenhart, Continuous Groups of Transformations, Princeton, 1933, p. 31.
THEOREM 1. The first parameter-group of $P_r$, $P_r'$ is isomorphic with $P_r$, $P_r'$,
respectively. The second parameter-group of $P_r$, $P_r'$ is isomorphic with $P_r'$, $P_r$,
respectively.*


Let us write

      $T$: $a' = f(a; \xi), \qquad S$: $a'' = f(a'; \eta),$
      $T'$: $a'' = f(a; \xi'), \qquad S'$: $a''' = f(a'; \eta'),$
      $T''$: $a''' = f(a; \xi''), \qquad R$: $a''' = f(a''; \theta).$

From $T$, $T'$, and $S$, we have

      $\xi' = g(\eta; \xi).$

From $T$, $T''$, and $S'$,

      $\xi'' = g(\eta'; \xi),$

and from $T'$, $T''$, and $R$,

      $\xi'' = g(\theta; \xi').$

These are transformations on the $\xi$ with parameters $\eta$, $\eta'$, and $\theta$. Their first
parameter-group is

      $\eta' = g'(\theta; \eta).$

But directly from $S$, $S'$, and $R$ we have

      $\eta' = g(\theta; \eta)$

as the relation by which the $\eta'$ are defined in terms of the $\eta$. Thus the first
parameter-group of $P_r$ is isomorphic with $P_r$.

The rest of the theorem may be proved similarly. 

3. Concomitants. Let us consider the group

(3.1)  $G_r$:  $a' = f(a; \xi)$

and the parameter-groups

(3.2)  $P_r$:  $x' = g(\xi; x),$
(3.3)  $P_r'$:  $u' = g(u; \xi_{-1}),$

with the variables $u$ contragredient to the variables $x$.
The concept of invariant of a group is well established. If

      $F(a_1, a_2, \cdots, a_n) = F(a)$

is any function of the variables such that $F(a) = F(a')$ is an identity in the

* Lie noted that the (first) parameter-group is its own parameter-group. S. Lie, Videnskabs- 
Selskabet i Christiania, Forhandlinger, 1884, No. 15. 
$a$ and the $\xi$ when the $a'$ are replaced by their values as given by (3.1), the
function $F(a)$ is called an absolute invariant of $G_r$.

The following formulation of the concept of covariant is believed to be
new. If $F(a; x) = F(a'; x')$ is an identity in the variables $a$ and $x$ and the
parameters $\xi$ when $a'$ and $x'$ are replaced by their values as given by (3.1)
and (3.2), then we shall call $F$ an absolute covariant of $G_r$.

Similarly, if $F(a; u) = F(a'; u')$ holds identically by virtue of $G_r$ and $P_r'$,
we shall call $F$ an absolute contravariant of $G_r$.

More generally, if $F(a; x; u) = F(a'; x'; u')$ holds identically in all the
letters involved after $a'$, $x'$, and $u'$ have been replaced by their values as
given by $G_r$, $P_r$, and $P_r'$, we shall call $F$ an absolute concomitant of $G_r$. Thus
the concept of concomitant includes invariant, covariant, and contravariant
as special instances.

A concomitant involving only the $x$ and $u$ is sometimes called an
identical concomitant.

4. Structure of covariants. We prove the following theorem.


THEOREM 2. Let $F(a)$ be any function. Let $F(a)$ become $G(a'; \xi)$ under $G_r$.
Then $G(a; x)$ is a covariant of $G_r$.


The function $F(a)$ is called the source* of $G(a; x)$, and we shall write

      $G(a; x) = [F(a)].$

Let $F(a)$ become under (2.1) and (2.2) $T'$

      $F(a) = G(a'; \xi) = G(a''; \zeta).$

These expressions are identical by virtue of (2.2) $T$ and (2.3). Since the $x$
are cogredient with the $\xi$,

      $G(a'; x) = G(a''; x')$

holds identically in $\eta$ by virtue of (2.2) $T$ and (2.4). Hence by a change of
variables

      $G(a; x) = G(a'; x')$

holds identically in $\xi$ by virtue of (3.1) and (3.2).
The covariant $G(a; x) = [F(a)]$ is uniquely defined by $F(a)$.
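
The construction of Theorem 2 can be traced on a hypothetical example group $a' = \xi_1 a + \xi_2$ (not taken from the paper), whose first parameter-group is $x_1' = \xi_1 x_1$, $x_2' = \xi_1 x_2 + \xi_2$. Solving for $a$ gives $a = (a' - \xi_2)/\xi_1$, so a source $F(a)$ becomes $G(a'; \xi) = F((a' - \xi_2)/\xi_1)$. The sketch below, with illustrative names, checks that $G(a; x)$ is then unaltered by the group and that setting $x = \xi_0$ recovers the source.

```python
from fractions import Fraction as Fr

def f(a, xi):
    """Hypothetical example group: a' = xi1*a + xi2, with xi1 != 0."""
    return xi[0] * a + xi[1]

def g(eta, xi):
    """Composition law of the example group; (3.2) reads x' = g(xi; x)."""
    return (eta[0] * xi[0], eta[0] * xi[1] + eta[1])

def bracket(F):
    """[F]: express F(a) through a' and xi, then write (a, x) for (a', xi)."""
    def G(a, x):
        return F((a - x[1]) / x[0])
    return G

F = lambda a: a * a + 1          # an arbitrary source
G = bracket(F)

a, x, xi = Fr(3), (Fr(2), Fr(5)), (Fr(7), Fr(-4))
print(G(a, x) == G(f(a, xi), g(xi, x)))    # covariance: G(a; x) = G(a'; x')
print(G(a, (Fr(1), Fr(0))) == F(a))        # x = xi_0 gives back the source
```

Any other choice of $F$ may be substituted; the argument $(a - x_2)/x_1$ is itself the elementary covariant $[a]$ of this example group.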
THEOREM 3. Every covariant has a source.
Let $G(a; x)$ be a covariant, and denote $G(a; \xi_0)$ by $F(a)$, where $\xi_0$ are the
values of the parameters which reduce (2.1) to the identity. We shall show
that $G(a; x) = [F(a)]$.


* For a development of the projective theory along these lines, see J. Deruyts, Essai d'une
Théorie Générale des Formes Algébriques, Liége, 1890.
If $G(a; x)$ is a covariant, then by the substitution of cogredient variables
we obtain

      $G(a'; \xi) = G(a''; \zeta)$

holding by virtue of (2.2) $T$ and (2.3). In particular set $\zeta = \xi_0$ so that $a'' = a$.
Then

      $G(a'; \xi) = G(a; \xi_0) = F(a)$

holds by virtue of (2.1). This shows that $G(a; x)$ is obtainable from $F(a)$
according to the procedure of Theorem 2.
The source of $G(a; x)$ is unique.

In particular the $n$ functions $a_1, a_2, \cdots, a_n$ are sources of the elementary
covariants

      $[a_1], [a_2], \cdots, [a_n].$

THEOREM 4. Every covariant is a function of the elementary covariants.
Let

      $G(a; x) = [F(a)] = [F(a_1, a_2, \cdots, a_n)].$

Then

      $G(a; x) = F([a_1], [a_2], \cdots, [a_n]),$

for one may use (2.1) on $F(a)$ and then replace the parameters by the $x$, or
replace the $\xi$ in (2.1) by the $x$ and use the result on $F(a)$.

Similar results hold for contravariants. Let $F(a)$ be any function, and let
$F(a')$ become $G(a; \xi)$ under $G_r$. Then $G(a; u)$ is an absolute contravariant of
$G_r$. The analogs of Theorems 3 and 4 hold for contravariants.


THEOREM 5. The functions $g_i(u; x)$ determined by $P_r'$ are identical
concomitants, and every identical concomitant is a function of them.


Let us take $x' = g(\eta; x)$ as $G_r$. By Theorem 1 the second parameter-group
is still (2.6). Hence by the preceding paragraph every $g_i(u; x)$ is an absolute
contravariant of $G_r$, and hence a concomitant of the original group.

Since the $g_i(u; x)$ are the elementary contravariants of $G_r$, every
contravariant of $G_r$ (that is, every identical concomitant of (2.1)) is a function of
them.

The same concomitants are obtained by finding the elementary covariants
of (2.5).

5. Characterization of geometric properties. One of the major problems 
in the application of invariant theory to geometry is the characterization of 
geometric properties by means of invariants or covariants or tensors. The
following theorem gives a general solution of this problem.


THEOREM 6. A necessary and sufficient condition in order that

      $\phi_1(a) = 0, \quad \phi_2(a) = 0, \quad \cdots$

shall hold for all coordinate systems is that the covariants

      $[\phi_1(a)], \quad [\phi_2(a)], \quad \cdots$

vanish identically in $x$.


If $\phi_i(a) = 0$ holds for all coordinate systems, then the functions $\psi_i(a; \xi)$
obtained by using (2.1) in $\phi_i(a')$ must vanish for all values of $\xi$. That is,

      $\psi_i(a; x) = [\phi_i(a)]$

must vanish identically in $x$.

Since $\psi_i(a; \xi_0) = \phi_i(a)$, the condition is sufficient.

If a system of equations $\phi_i(a) = 0$ characterize a geometric property of $G_r$,
it is not necessarily true that there exists a set of covariants having the
functions $\phi_i(a)$ as their coefficients. But by Theorem 6 the coefficients of the
covariants $[\phi_i(a)]$ when set equal to zero constitute a system of equations which
also characterize the geometric property. It must be true, then, that this
latter system is equivalent to the system $\phi_i(a) = 0$. Thus the problem of putting
the system $\phi_i(a) = 0$ into covariant form has been solved.

It is thus evident that contravariants and mixed concomitants are not
essential in geometry. Indeed, it is evident from the reciprocal relationship
of $P_r$ and $P_r'$ that the theories of covariants and contravariants are
coextensive. The use of contravariants, however, is often a matter of great
convenience.

6. Relative covariants and contravariants. In this paragraph we shall
assume that $\mathfrak{F}$ is the complex field, and that the functions $f_i$ have differential
coefficients of the first order.

In order that a transformation of type (2.1) be non-singular, that is, have
an inverse, it is necessary and sufficient that the jacobian

      $J(a'; a) = \left| \dfrac{\partial a_i'}{\partial a_j} \right|$

be different from zero. Since $G_r$ consists only of non-singular transformations,
those sets of values of $a_1, a_2, \cdots, a_n, \xi_1, \xi_2, \cdots, \xi_r$ which make $J = 0$ are
excluded.
A function $F(a; x; u)$ such that

      $F(a'; x'; u') = [J(a'; a)]^{\mu} F(a; x; u)$

is called a relative concomitant of weight $\mu$. Relative covariants and relative
contravariants are special instances. Since $J \neq 0$, the identical vanishing of a
relative concomitant is invariantive.


THEOREM 7. If $J(a'; a)$ is written as $C(a'; \xi)$ by means of (2.1), then $C(a; x)$
is a relative covariant of weight 1. If $J(a'; a)$ is written as $D(a; \xi)$ by means of
(2.1), then $D(a; u)$ is a relative contravariant of weight $-1$.


Consider transformations (2.1) and (2.2). Since $J(a''; a)$ is a function of
$a$ and $\zeta$, we can use (2.2) $T'$ to write

      $J(a''; a) = C(a''; \zeta).$

Similarly from (2.1)

      $J(a'; a) = C(a'; \xi).$

Now from the familiar relation

      $J(a''; a) = J(a''; a') J(a'; a),$

we have

      $C(a''; \zeta) = J(a''; a') C(a'; \xi).$

This is an identity by virtue of (2.2) and (2.3). Hence

      $C(a''; x') = J(a''; a') C(a'; x)$

is an identity in $\eta$ by virtue of (2.2) and (2.4). That is, by a change of
variables,

      $C(a'; x') = J(a'; a) C(a; x)$

is an identity in $\xi$ by virtue of (3.1) and (3.2), and $C(a; x)$ is a relative
covariant of weight 1.
Similarly if we write

      $J(a''; a) = D(a; \zeta), \qquad J(a''; a') = D(a'; \eta)$

by means of (2.2), we have

      $D(a; \zeta) = D(a'; \eta) J(a'; a)$

holding by virtue of (2.1) and (2.3). That is,

      $D(a'; u') = [J(a'; a)]^{-1} D(a; u)$

holds by virtue of (2.1) and

      $u = g(u'; \xi).$

Since these concomitants never vanish, multiplying an invariantive
equation by a power of one of them does not alter the geometric nature of the
equation.


THEOREM 8. Every relative covariant of weight $\mu$ can be represented as a
function of the elementary covariants multiplied by $[C(a; x)]^{\mu}$. Every relative
contravariant of weight $\mu$ can be represented as a function of the elementary
contravariants divided by $[D(a; u)]^{\mu}$.

For if $F(a; x)$ is a covariant of weight $\mu$, $F(a; x)/[C(a; x)]^{\mu}$ is absolute,
and can by Theorem 4 be represented as a function of the elementary
covariants.

7. Covariants of the projective group. Let $G_r$ be

(7.1)  $a_i' = \sum_{j=1}^{n} \xi_{ij} a_j \qquad (i = 1, 2, \cdots, n)$

where the $\xi_{ij}$ are independent except that $J(a'; a) = |\xi_{ij}| \neq 0$. Corresponding
to (2.2) we have

      $a_i'' = \sum_{j=1}^{n} \eta_{ij} a_j', \qquad a_i'' = \sum_{j=1}^{n} \zeta_{ij} a_j.$

Corresponding to (2.3) we have

(7.2)  $\zeta_{ik} = \sum_{j=1}^{n} \eta_{ij} \xi_{jk}.$

If we define the new independent variables $x_i^{(k)}$ in accordance with §2, we
have

(7.3)  $x_i'^{(k)} = \sum_{j=1}^{n} \eta_{ij} x_j^{(k)} \qquad (i, k = 1, 2, \cdots, n).$

Thus for every $k$ we have a group of the same form as (7.1). Since the $x'^{(k)}$
are functions only of the $x^{(k)}$, we say that (7.3) is intransitive, breaking up
into $n$ blocks or sets of intransitivity.* Thus


THEOREM 9. The first parameter-group of (7.1) on the $n^2$ variables $x_i^{(k)}$ is
intransitive, and each of its $n$ sets of intransitivity is isomorphic with (7.1).

This result does much to explain the simplicity of the projective covariant
theory. It has long been known that the covariant theory using a single set
of $n$ variables is inadequate for geometry, while $n$ cogredient sets are
sufficient. Indeed, if relative covariants are used, $n - 1$ cogredient sets suffice.†
Since $J(a'; a) = |\xi_{ij}|$, the relative covariant $C(a; x)$ of §6 is $X = |x_i^{(k)}|$.
To obtain the elementary covariants, write (7.1) in the form

      $a_j = \dfrac{1}{J(a'; a)} \sum_{i=1}^{n} \Xi_{ij} a_i'$

where $\Xi_{ij}$ is the cofactor of $\xi_{ij}$ in $(\xi_{rs})$. It is evident upon the replacement of
§4 that


THEOREM 10. The elementary covariants of (7.1) are $[a_i] = P_i/X$ where
$X = |x_i^{(k)}|$ and $P_i$ is obtained from $X$ by replacing the elements of the $i$th column
by the $a$.

* Miller, Blichfeldt and Dickson, Theory and Applications of Finite Groups, Wiley, 1916, p. 206.
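
Theorem 10 is Cramer's rule in disguise: $[a_i] = P_i/X$ is the $i$th component of the solution $y$ of the linear system whose matrix has the $n$ cogredient sets as columns and whose right member is $a$. The sketch below is a hypothetical illustration for small $n$ with exact arithmetic; the helper names are not from the paper, and the layout `X[i][k]` for $x_i^{(k)}$ is an assumed convention.

```python
from fractions import Fraction as Fr

def det(m):
    """Determinant by Laplace expansion along the first row (small n only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def matmul(p, q):
    """Product of two square matrices given as lists of rows."""
    n = len(q)
    return [[sum(p[i][k] * q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matvec(p, v):
    """Product of a matrix and a vector."""
    return [sum(p[i][k] * v[k] for k in range(len(v))) for i in range(len(p))]

def elementary_covariants(a, X):
    """[a_i] = P_i / |X|, where P_i is |X| with its i-th column replaced by a."""
    d = det(X)
    out = []
    for i in range(len(a)):
        P = [row[:i] + [a[r]] + row[i + 1:] for r, row in enumerate(X)]
        out.append(Fr(det(P)) / d)
    return out

# Under a' = Xi a, each column of X transforms like a, so X' = Xi X.
Xi = [[2, 1], [1, 1]]
a = [3, 5]
X = [[1, 2], [3, 4]]
print(elementary_covariants(a, X) == elementary_covariants(matvec(Xi, a), matmul(Xi, X)))
```

The comparison holds because the transformed system has the same solution $y$ as the original one.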


The second parameter-group of (7.1) may be written

      $u_{ik}' = \sum_{j=1}^{n} u_{ij} \xi_{jk}.$

It follows now from Theorem 5 that
THEOREM 11. The functions

      $\sum_{j=1}^{n} u_{ij} x_j^{(k)} \qquad (i, k = 1, 2, \cdots, n)$

are identical concomitants, and every identical concomitant is a function of them.


8. Covariants of algebraic forms. Consider a system of / algebraic forms 


The linear homogeneous transformation 


j=1 


induces upon the coefficients of the ground forms $\phi_i$ a set of transformations


k=1 


where the 8;;, are functions of the &. 

By an invariant or covariant of the forms $\phi_i$ is meant an invariant of the
induced group (8.2), or of (8.2) and its first parameter-group. The only role
of the ground forms $\phi_i$ is to determine the transformation (8.2), which is our
$G_r$.

† See, for instance, Clebsch, Abhandlungen, Gesellschaft der Wissenschaften zu Göttingen, vol.
17. Capelli, Atti, Accademia Nazionale dei Lincei, (3), vol. 12 (1882), pp. 529-598. Deruyts, loc. cit.,
introduction. A short proof was given by the writer, Bulletin of the American Mathematical Society,
vol. 29 (1923), p. 32.
Since the groups (8.1) and (8.2) are isomorphic, their first parameter-groups
are identical, namely (7.3).
If 


- 


is one of the ground forms, that set of intransitivity of (8.2) which it induces 


is in solved form ais 
= 

where Bi (€) is a polynomial in the é of degree p in the &m, of degree 
in the £2, etc., and of degree 7 in the £1,, of degree j in the £4, etc. In fact, 
[a,00-..] is the ground form ¢ with each x; replaced by x;, [@ono...] is@ with 
each x; replaced by x,, etc. It is not difficult to see that [ap, np, o-..] is, ex- 
cept for a numerical factor, the pth polar of [don0...] with respect to the x, 
and the (n—>/)th polar of [an00...] with respect to the x®. Further, all the 
elementary covariants are, except for non-zero numerical factors, the ground 
forms in the m cogredient sets of variables and their polars. Every absolute 
covariant is a function of these.* 

9. Covariants of the general functional transformation. Consider the group 


(9.1) $x_i' = f_i(x_1, \dots, x_n)$
where the $f_i$ range over all analytic functions of the complex variables
$x_1, x_2, \dots, x_n$ subject to the restriction that $J(x'; x) \neq 0$. This group may
conveniently be written
p=0 i,!ie! in! 


where 


(4) OP? x; 


We may look upon this as a group in infinitely many parameters, the $c$ and
the $\xi$, of which the latter are unessential in the sense that the $c$ are functions
of them.

The transformation on the parameters corresponding to (2.3) is 


=] 


* A. Capelli, Lezioni sulla Teoria delle Forme Algebriche, Naples, 1902, p. 247.




1936] COVARIANTS OF r-PARAMETER GROUPS 81 
07x; Oxi’ Ox} dx} 
03x}! Ox}! 
OX, i Ox} OX, OX, t 


Out of deference to convention we shall call the new variables of the pa- 
rameter-group differentials, and denote by the symbol 


d;,di, dj 


that variable which corresponds to the parameter 


xi 
OX; ,0X;, Oxi, 


Then the first parameter-group of (9.1) is 


xf = > Sa), 
Ox/ 
= — d,x;, 
OX; 
x! 
0%; 9% 5,0%;, 


Ox H 


i i 


The theory of sources carries over intact to this situation. The elementary 
covariants have as their sources the partial derivatives (9.3). 


THEOREM 12. If [@] is an absolute covariant whose source is $, then d is 
an absolute covariant whose source is 0¢/0x,. 


For 


and therefore 


With Theorem 4 this gives 


do > Op Ox, 
ax, a% 
0g 
Fa = >> — d,«, = 
OX, p 
THEOREM 13. Every covariant is a function of the elementary covariants $[x_i]$
and their differentials.


10. Covariants of differential forms. A differential form of order $k$ and degree
$n$ is a polynomial in the differentials (9.4), with coefficients which are
functions of the $x$, such that the sum of the products of the orders and degrees
of the differentials in each term is $n$, and at least one differential of
order $k$ is present, and none of higher order. Thus the right members of (9.4)
are forms of orders 0, 1, 2, ..., etc. If in these forms the partial derivatives
are replaced by arbitrary functions, the results are general forms.

In the invariant theory of differential forms, certain forms and “associated
functions” (which are merely forms of order 0) are taken as ground
forms. The group $G_r$ is the group induced on the coefficients of these ground
forms and their differentials by (9.4). It is evident that every absolute
covariant of this system is a function of the elementary covariants, whose
sources are the coefficients of the ground forms and their differentials.*

Classical tensor analysis is the covariant theory of a system of ground
forms of order 1, one of degree 2 and the rest of degree 0. The quadratic form
may be written $\sum g_{ij}\,dx_i\,dx_j$, and the associated functions $\phi_i$. A covariant
tensor is a system of functions of the $g_{ij}$, $\phi_i$ and their partial derivatives
which are the coefficients of a differential covariant of order 1. It has not
been the practice to recognize covariants of higher order,† although various
differential operators $d_1$, $d_2$, etc., are employed. Upon differentiation of a
covariant of order 1 and degree $k$, there results a covariant of order 2. By a
process known as covariant differentiation, it is possible to form with this
covariant of order 2 and others whose sources are functions of the $g_{ij}$ and
their derivatives a new covariant of order 1 and degree $k+1$.
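
The case $k=1$ illustrates the process just described. The following is only a sketch in the usual notation of tensor analysis; the coefficients $a_i$, the Christoffel symbols $\Gamma^k_{ij}$ of the quadratic form, the symbol $D^2x^k$ and the use of upper indices are assumptions introduced for the illustration and do not appear in the text.

```latex
% A covariant of order 1 and degree 1 (hypothetical coefficients a_i):
\phi = \sum_i a_i\,dx^i .
% Differentiation gives a covariant of order 2:
d\phi = \sum_{i,j} \frac{\partial a_i}{\partial x^j}\,dx^i\,dx^j + \sum_k a_k\,d^2x^k .
% The combination
D^2x^k = d^2x^k + \sum_{i,j} \Gamma^k_{ij}\,dx^i\,dx^j
% transforms like dx^k, so \sum_k a_k D^2x^k is a covariant of order 2 whose
% source involves the g_{ij} and their derivatives. Subtracting it removes
% the second-order differentials:
d\phi - \sum_k a_k\,D^2x^k
  = \sum_{i,j} \Bigl( \frac{\partial a_i}{\partial x^j} - \sum_k \Gamma^k_{ij}\,a_k \Bigr) dx^i\,dx^j .
% The result is of order 1 and degree 2 = k + 1.
```

Since the product $dx^i\,dx^j$ is symmetric, only the symmetric part of the bracketed coefficient enters the resulting form.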

The important property of a tensor is that the simultaneous vanishing of
its components is invariantive. There seems to be no valid reason for
demanding the “tensor law of transformation,” which means that the covariant
shall be of the first order, for the coefficients of differential covariants of all
orders have the above mentioned property. That the use of such covariants
is of great value in reducing computation and in expressing new relations has
been conclusively shown.‡

* E. Pascal, Atti, Accademia Nazionale dei Lincei, Memorie, (5), vol. 8 (1910), pp. 3-99, treated
the covariant theory of forms of order higher than the first by other methods.

† An exception to this statement is the work of E. Noether, Nachrichten, Gesellschaft der
Wissenschaften zu Göttingen, vol. 25 (1918), p. 37.
‡ E. Noether, loc. cit.
LINEAR TRANSFORMATIONS IN $\mathfrak{L}_p$, $p>1$†


BY 
F. J. MURRAY‡


INTRODUCTION 


In this paper, we study linear and closed linear transformations in $\mathfrak{L}_p$.
$\mathfrak{L}_p$ is the space of measurable complex-valued functions defined in the
interval (0, 1) for which

$$\int_0^1 |f(x)|^p\,dx$$

exists. We shall use certain results which may be found in Banach’s treatise§
and which are there shown for the space of real-valued functions but which
can easily be extended to the space of complex-valued functions. We follow
the notation of (B) in general; otherwise an explicit definition is given.

We shall study transformations by means of their “graphs,” i.e., the set
of pairs $\{f, Tf\}$ in the product space $\mathfrak{L}_p \times \mathfrak{L}_p$. The graph is used in (B) at
one place, but Banach confines his attention to limited transformations, while
the set of linear and even closed linear transformations is known to be a wider
class.|| The graph has also been used by J. von Neumann¶ to obtain certain
results for $\mathfrak{L}_2$ which we generalize to $\mathfrak{L}_p$.

The graph permits us to study linear transformations by studying linear
manifolds, and in §1 we obtain the relationship between the operations of
taking the orthogonal complement of a linear manifold ($\mathfrak{M}^\perp$; cf. Definition
1.1), of forming the intersection with a space of higher index, and of closure
in a space of lower index. In §2, we obtain the analogous results for
transformations. In §3, we apply the above results to the study of the closure in
$\mathfrak{L}_p$, $1<p<2$, of transformations in $\mathfrak{L}_2$. In particular, projections are discussed
in terms of the closure of their ranges. We also generalize the notion of an
Hermitian transformation on the basis of the familiar inclusion $H^* \supseteq H$.††

† Presented to the Society, February 23, 1935; received by the editors April 13, 1935.

‡ National Research Fellow.

§ Théorie des Opérations Linéaires, Warsaw, 1932. This reference will be denoted by (B) in what
follows.

|| For a consideration of linear and closed linear transformations in $\mathfrak{L}_2$, see Stone, Linear
Transformations in Hilbert Space, American Mathematical Society Colloquium Publications, vol. 15, 1932.
We shall refer to this work as (S).

¶ Annals of Mathematics, (2), vol. 33 (1932), pp. 294-310. We shall refer to this memoir as (N).

†† We prefer to use the perpendicular instead of the adjoint, but one class is obtained from the
other by multiplication by $i$.
In §4, an example is given to illustrate the application of the above to the
study of a particular transformation. A second example shows that even for
the bounded closure of a self-adjoint transformation in $\mathfrak{L}_p$, the spectral theory
does not hold in general.

A word as to the extension to $\mathfrak{L}_p$ of the results of (B) for real function
spaces. The only essential addition, it will be found, is the application of (B),
Theorem 2, chapter IV, p. 55. It may be stated as follows. Let $\mathfrak{M}$ be a closed
linear manifold (cf. Definition 1.1) in $\mathfrak{L}_p$. Let $f(y)$ be a limited linear
(complex-valued) functional defined on $\mathfrak{M}$. Let $C$ be such that $|f(y)| \le C\|y\|_p$, $y \in \mathfrak{M}$.
There exists a linear functional $F$ defined on $\mathfrak{L}_p$, such that $F(y) = f(y)$, $y \in \mathfrak{M}$,
$|F(x)| \le C\|x\|_p$ for all $x \in \mathfrak{L}_p$.

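We sketch the proof. $\mathfrak{L}_p$ may be put in correspondence with a Banach space
$\mathfrak{L}_p^2$ of pairs of real functions $\{f_1, f_2\}$ with the metric
$\|\{f_1, f_2\}\| = \|f_1 + if_2\|_p$,
i.e., if $y \in \mathfrak{L}_p$, then $y \sim \{f_1, f_2\}$ if $y = f_1 + if_2$. A linear manifold $\mathfrak{M}$ in $\mathfrak{L}_p$, however,
corresponds to a linear vector subspace $\mathfrak{M}^2$ of $\mathfrak{L}_p^2$ which is invariant
under the operation $U\{f_1, f_2\} = \{-f_2, f_1\}$, the operation which corresponds to
multiplication by $i$ in $\mathfrak{L}_p$. Now, corresponding to $f(y)$, we have a
complex-valued linear functional on $\mathfrak{M}^2$, $f(\{f_1, f_2\}) = \Re f(\{f_1, f_2\}) + i\,\Im f(\{f_1, f_2\})$. Since
$f(U\{f_1, f_2\}) = if(\{f_1, f_2\})$, we have $\Im f(\{f_1, f_2\}) = -\Re f(\{-f_2, f_1\})$.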

We now extend by the above mentioned theorem of (B) the linear
functional $\Re f(\{f_1, f_2\})$ to the whole space $\mathfrak{L}_p^2$. Calling the extension $R(\{f_1, f_2\})$,
we let $F(\{f_1, f_2\}) = R(\{f_1, f_2\}) - iR(\{-f_2, f_1\})$ and by elementary
considerations one can show that $F$ corresponds to a complex-valued linear functional
$F(z)$ on $\mathfrak{L}_p$.

Now since multiplication in $\mathfrak{L}_p$ by a number of the form $e^{i\theta}$ corresponds
to a unitary transformation in $\mathfrak{L}_p^2$, one sees that the bound of $R(\{f_1, f_2\})$ in $\mathfrak{L}_p^2$
is exactly the bound of $F(z)$ in $\mathfrak{L}_p$. For if $F(z) = ce^{i\theta}$, then for $z' = e^{-i\theta}z = f_1' + if_2'$,
$F(z') = R(\{f_1', f_2'\}) = |F(z)|$, while $\|z'\|_p = \|z\|_p$. This implies
that the bound of the linear functional $R(\{f_1, f_2\})$ on $\mathfrak{L}_p^2$ is greater than or
equal to the bound of $F(z)$ on $\mathfrak{L}_p$, and of course this means the equality.

The above reasoning may be applied to $f(z)$ to show that its bound is the
same as that of $\Re f(\{f_1, f_2\})$ and since in the extension process the bound is
not altered, we have that the bound of $F$ is the same as that of $f$.
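
The elementary considerations mentioned in the sketch can be made explicit. The following check shows that $F$ commutes with multiplication by $i$, which together with real linearity gives complex linearity, and that $F$ agrees with $f$ on the manifold; the abbreviation $x = \{f_1, f_2\}$ is introduced here only for brevity.

```latex
% U x = \{-f_2, f_1\} corresponds to multiplication by i, and U^2 x = -x.
\begin{aligned}
F(Ux) &= R(Ux) - iR(U^2x) = R(Ux) - iR(-x) = R(Ux) + iR(x) \\
      &= i\bigl(R(x) - iR(Ux)\bigr) = iF(x).
\end{aligned}
% On the subspace corresponding to the manifold, R coincides with \Re f,
% and f(Ux) = i f(x) gives \Im f(x) = -\Re f(Ux). Hence
F(x) = \Re f(x) - i\,\Re f(Ux) = \Re f(x) + i\,\Im f(x) = f(x).
```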


1. LINEAR MANIFOLDS IN $\mathfrak{L}_p$


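DEFINITION 1.1. A set of functions, $\mathfrak{M}$, such that $\mathfrak{M} \subseteq \mathfrak{L}_p$, and such that if
$f$ and $g \in \mathfrak{M}$, then $af + bg \in \mathfrak{M}$ where $a$ and $b$ are any two complex constants, is
called a linear manifold in $\mathfrak{L}_p$. If $\mathfrak{M}$ is also closed, then $\mathfrak{M}$ is called a closed linear
manifold in $\mathfrak{L}_p$. If $\mathfrak{M}$ is a linear manifold in $\mathfrak{L}_r$, then the closure of $\mathfrak{M}$ in $\mathfrak{L}_r$ is a
closed linear manifold in $\mathfrak{L}_r$, and is denoted by $[\mathfrak{M}]^r$. If $\mathfrak{M}$ is a linear manifold
in $\mathfrak{L}_r$, and if $t<r$, then $\mathfrak{M}$ is also a linear manifold in $\mathfrak{L}_t$; also if $t>r$, we denote
by $\mathfrak{M} \cdot \mathfrak{L}_t$ the linear manifold of all those functions which are in both $\mathfrak{M}$ and $\mathfrak{L}_t$.
If $1/p + 1/p' = 1$ and if $\mathfrak{M}$ is a linear manifold in $\mathfrak{L}_p$, then the set of functions $f$
in $\mathfrak{L}_{p'}$ such that $\int_0^1 gf\,dx = 0$ for every $g \in \mathfrak{M}$, will be denoted by $\mathfrak{M}^\perp$.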


It follows from the Hölder inequality that $\mathfrak{M}^\perp$ is a closed linear manifold.
In what follows, $\mathfrak{M}$ will denote a linear manifold.
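
The appeal to the Hölder inequality can be spelled out as follows. This is a sketch of the standard argument, not a quotation from (B); the sequence $f_n$ is a name introduced for the illustration.

```latex
% Let f_n lie in the orthogonal complement of M, f_n -> f in L_{p'},
% and let g be any element of M (a subset of L_p), 1/p + 1/p' = 1.
\Bigl| \int_0^1 g f\,dx \Bigr|
  = \Bigl| \int_0^1 g\,(f - f_n)\,dx \Bigr|
  \le \|g\|_p \, \|f - f_n\|_{p'} \longrightarrow 0 .
% Hence \int_0^1 g f\,dx = 0 for every g in M, so the limit f is again
% orthogonal to M, and the orthogonal complement is closed in L_{p'}.
```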


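THEOREM 1.1. If $\mathfrak{M} \subseteq \mathfrak{L}_s$ is a closed linear manifold, then $\mathfrak{M} \cdot \mathfrak{L}_r$, $r>s$, is
closed in $\mathfrak{L}_r$. If $\mathfrak{M}$ is a linear manifold in $\mathfrak{L}_s$ and $t<s$, then $[\mathfrak{M}]^t \cdot \mathfrak{L}_s \supseteq [\mathfrak{M}]^s$.

This follows easily from the fact that if $r>s$, and if $f_n \to f$ in $\mathfrak{L}_r$, then $f_n \to f$
in $\mathfrak{L}_s$, which is shown by using the Hölder inequality.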
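
For the interval (0, 1) the fact used in this proof amounts to the norm inequality $\|f\|_s \le \|f\|_r$ for $r>s$. A short derivation, given here as a sketch under the normalization that the interval has length 1:

```latex
% Hölder's inequality with exponents r/s and r/(r-s), applied to |f|^s \cdot 1:
\int_0^1 |f|^s\,dx
  \le \Bigl( \int_0^1 |f|^r\,dx \Bigr)^{s/r} \Bigl( \int_0^1 1\,dx \Bigr)^{1 - s/r}
  = \|f\|_r^{\,s},
\qquad\text{so}\qquad \|f\|_s \le \|f\|_r .
% Consequently \|f_n - f\|_s \le \|f_n - f\|_r \to 0 whenever f_n \to f in L_r.
```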

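THEOREM 1.2. If $t<r<s$ and $\mathfrak{M} \subseteq \mathfrak{L}_s$, then $[[\mathfrak{M}]^r]^t = [\mathfrak{M}]^t$. Also if
$[\mathfrak{M}]^t \cdot \mathfrak{L}_s = \mathfrak{M}$, then $[\mathfrak{M}]^r \cdot \mathfrak{L}_s = \mathfrak{M}$.

Since $[\mathfrak{M}]^r \supseteq \mathfrak{M}$, we have $[[\mathfrak{M}]^r]^t \supseteq [\mathfrak{M}]^t$. But we also have $[\mathfrak{M}]^r \subseteq [\mathfrak{M}]^t$,
from the fact mentioned in the proof of the previous theorem. Hence
$[[\mathfrak{M}]^r]^t \subseteq [\mathfrak{M}]^t$. Hence $[[\mathfrak{M}]^r]^t = [\mathfrak{M}]^t$. Also when $[\mathfrak{M}]^t \cdot \mathfrak{L}_s = \mathfrak{M}$, by Theorem
1.1, we have

$\mathfrak{M} = [\mathfrak{M}]^t \cdot \mathfrak{L}_s \supseteq [\mathfrak{M}]^r \cdot \mathfrak{L}_s \supseteq \mathfrak{M}$.
Hence $[\mathfrak{M}]^r \cdot \mathfrak{L}_s = \mathfrak{M}$.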

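THEOREM 1.3. If $\mathfrak{M}$ is a linear manifold in $\mathfrak{L}_p$, then $\mathfrak{M}^\perp$ is a closed linear
manifold and $(\mathfrak{M}^\perp)^\perp = [\mathfrak{M}]^p$. If $\mathfrak{M} \supset \mathfrak{M}'$, and $\mathfrak{M}'$ is closed in $\mathfrak{L}_p$, then $\mathfrak{M}^\perp \subset \mathfrak{M}'^\perp$.
If $\mathfrak{M} \supseteq \mathfrak{M}'$ then $\mathfrak{M}^\perp \subseteq \mathfrak{M}'^\perp$.

We have indicated the proof that $\mathfrak{M}^\perp$ is a closed linear manifold. It follows
readily from the definition that $(\mathfrak{M}^\perp)^\perp \supseteq [\mathfrak{M}]^p$. Now suppose there is a
function $f \in \mathfrak{L}_p$, $f \neq 0$, which is in $(\mathfrak{M}^\perp)^\perp$, but not in $[\mathfrak{M}]^p$. By (B), chapter IV, p. 57,
lemma, and chapter IV, p. 64, there exists a function $g(x) \in \mathfrak{L}_{p'}$, such that

$$\int_0^1 gf\,dx = 1; \qquad \int_0^1 gh\,dx = 0 \ \text{for all } h \in [\mathfrak{M}]^p.$$

Hence $g \in \mathfrak{M}^\perp$, but we also see that $f$ cannot be in $(\mathfrak{M}^\perp)^\perp$, contrary to
hypothesis, hence $(\mathfrak{M}^\perp)^\perp = [\mathfrak{M}]^p$.

Now it is easy to see that $\mathfrak{M}^\perp \subseteq \mathfrak{M}'^\perp$. But if $\mathfrak{M} \supset \mathfrak{M}'$ and $\mathfrak{M}^\perp = \mathfrak{M}'^\perp$, then
$\mathfrak{M} \subseteq (\mathfrak{M}'^\perp)^\perp = [\mathfrak{M}']^p = \mathfrak{M}'$, contrary to the assumption that
$\mathfrak{M} \supset \mathfrak{M}'$.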

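THEOREM 1.4. If $p \ge r$, $p' \le r'$, $1/p + 1/p' = 1$, $1/r + 1/r' = 1$, and $\mathfrak{M}$ is a
linear manifold in $\mathfrak{L}_p$, then $([\mathfrak{M}]^r)^\perp = \mathfrak{M}^\perp \cdot \mathfrak{L}_{r'}$.

$\mathfrak{M}^\perp \cdot \mathfrak{L}_{r'}$ is orthogonal to $\mathfrak{M}$ considered in $\mathfrak{L}_r$ and hence to $[\mathfrak{M}]^r$. Thus
$([\mathfrak{M}]^r)^\perp \supseteq \mathfrak{M}^\perp \cdot \mathfrak{L}_{r'}$. But $([\mathfrak{M}]^r)^\perp$, which is in $\mathfrak{L}_{r'}$ hence in $\mathfrak{L}_{p'}$, is orthogonal
to $\mathfrak{M}$, and hence $([\mathfrak{M}]^r)^\perp \subseteq \mathfrak{M}^\perp$ and $([\mathfrak{M}]^r)^\perp \subseteq \mathfrak{M}^\perp \cdot \mathfrak{L}_{r'}$. The combination of the
two inclusions gives the desired result.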


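THEOREM 1.5. If $p \le r$, $p' \ge r'$, $1/p + 1/p' = 1$, $1/r + 1/r' = 1$, and $\mathfrak{M}$ is a
closed linear manifold in $\mathfrak{L}_p$, then $(\mathfrak{M} \cdot \mathfrak{L}_r)^\perp = [\mathfrak{M}^\perp]^{r'}$.

Now $(\mathfrak{M} \cdot \mathfrak{L}_r)^\perp \supseteq \mathfrak{M}^\perp$ since if $f \in \mathfrak{M}^\perp$, $f \in \mathfrak{L}_{r'}$ and is orthogonal to $\mathfrak{M} \cdot \mathfrak{L}_r$
for it is orthogonal to $\mathfrak{M}$. Since $(\mathfrak{M} \cdot \mathfrak{L}_r)^\perp$ is closed, we must also have
$(\mathfrak{M} \cdot \mathfrak{L}_r)^\perp \supseteq [\mathfrak{M}^\perp]^{r'}$. But $([\mathfrak{M}^\perp]^{r'})^\perp$ is orthogonal to $[\mathfrak{M}^\perp]^{r'}$, hence to $\mathfrak{M}^\perp$, and
hence must be in $(\mathfrak{M}^\perp)^\perp = \mathfrak{M}$ and of course in $\mathfrak{L}_r$. Hence $\mathfrak{M} \cdot \mathfrak{L}_r \supseteq ([\mathfrak{M}^\perp]^{r'})^\perp$.
Theorem 1.3 now implies that $(\mathfrak{M} \cdot \mathfrak{L}_r)^\perp \subseteq [\mathfrak{M}^\perp]^{r'}$. This and our previous result
imply the theorem.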


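THEOREM 1.6. If the set $\mathfrak{S}$ is everywhere dense in the linear manifold $\mathfrak{M}$ in $\mathfrak{L}_r$,
and $s<r$, then $\mathfrak{S}$ is everywhere dense in $[\mathfrak{M}]^s$.


$\mathfrak{S}$ is everywhere dense in $\mathfrak{M}$ considered in $\mathfrak{L}_s$, since if a sequence $\{f_n\}$
is such that $f_n \to f$ in $\mathfrak{L}_r$, then $f_n \to f$ in $\mathfrak{L}_s$. $\mathfrak{M}$ is everywhere dense in $[\mathfrak{M}]^s$,
hence by a well known argument $\mathfrak{S}$ is everywhere dense in $[\mathfrak{M}]^s$.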
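
The well known argument is a two-step approximation. It is sketched here for completeness; the letters $h$, $m$, $\sigma$ and $\epsilon$ are names introduced for the illustration.

```latex
% Given h in the closure of M in L_s and \epsilon > 0:
%   choose m in M with \|h - m\|_s < \epsilon/2            (M is dense in its closure),
%   then \sigma in S with \|m - \sigma\|_r < \epsilon/2     (S is dense in M in L_r).
% Since \| \cdot \|_s \le \| \cdot \|_r on (0,1) for s < r,
\|h - \sigma\|_s \le \|h - m\|_s + \|m - \sigma\|_s
                 \le \|h - m\|_s + \|m - \sigma\|_r < \epsilon .
```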


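DEFINITION 1.2. Let $\mathfrak{M}$ be a closed linear manifold in $\mathfrak{L}_p$. Let $p_1<p$. Now
if for all $r$ such that $p_1 \le r<p$, $[\mathfrak{M}]^{p_1} \cdot \mathfrak{L}_r = [\mathfrak{M}]^r$, then $\mathfrak{M}$ is said to be of simple
lace between $p$ and $p_1$.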


DeFIniTION 1.3. Let M be a closed linear manifold in &%,. Let pi<p 
and piSrSp, pisuiSrSvSp, i=1, 2, 3,---. We define 
Mo” = Ma (wr) = Me = v2) 
= [Mes (te1) Me” = (v1) g,, and for k=1,2,3,---, 


(r) (r) 
(r) (r) r 
(r) (r) 


r 
() (r) 
Me (2, = [Me (v1, Vex41) | 

+ These are the only ways in which distinct manifolds can arise by a finite number of applications 
of these processes. For if we repeat the same process after the initial steps, we get the same manifold 
as we had applied once in a suitable way, for instance, [Yt,")(u) |"*-2-=Dt (min (a, w2)). For sup- 
or which with our previous inclusion 
gives the result. But if on the other hand <1, we have 


=((m]" which with our previous inclusion proves the result. 
Now if for some value of $r$ such that $p_1 \le r<p$, there are $n$ distinct linear
manifolds among the $\mathfrak{M}_1^{(r)}$, $\mathfrak{M}_2^{(r)}$, $\mathfrak{M}_1^{(r)}(u_1)$, etc., defined above, and for no value
of $r$ between $p_1$ and $p$ are there more than $n$, $\mathfrak{M}$ is said to have an $n$-valued lace
between $p$ and $p_1$.


DEFINITION 1.4. If M is a closed linear manifold in %,, and p<h,, 
PSUs Sr Sr; F pr, we define M- Me™, [M- Vp, = Me™ (a1) 
= [M- Mr = [Mr and Me(a1, ue), Mi” v2), etc., by the 
inductive relations given in Definition 1.3. 

If, for some value of r such that pSr<p,, there are n distinct manifolds 
among the Dts”, Ms, etc., thus defined, and for no such value of r are there 
more than n, IN is said to have an n-valued lace between p and py. 


THEOREM 1.7. Let M be a closed linear manifold in %,. Let p:<p, piSrSp; 
then if 1/p+1/p’ =1/r+1/r’ =1/pi+1/pi =1, we have =r’'=p’. Let 
1/u;+1/u/ =1; then pi =p’. 
Then 


(r) (r) (r) 


(r) (r) (r’) 


(r’) 


Vox) 


(Ms (v1, = (m+), (0, Usk) 


(r’) 


(Ms Vory1))* = (M+): (04, 

We first show that 2.72 ,---, ). For Ms(v:) and 
Ms” > Mi” (us), since [M]™- L, > [M]-L,,, and since [M]™-&, is closed, 
[Me > &., = Me Also Lu, = [[M]]- 2 [Me], 
hence [IN ]- = [Mt]. L-2 [M]- Now if N is such that 
N, then NS [M]™- and since [M]- is closed, [M]- Lu, 
2[M]“, hence [RM] also [M]-k, 
> [[M]™-L,- An inductive proof will now give the desired 
result. A similar type of proof will show that Mi €M:"( ,---, ). 

It is easy to give an inductive proof of the statements concerning orthog- 
onal complements based on Theorems 1.4 and 1.5. 

The proof of the following theorem is similar to that of the above. 


Now we notice that , , vs) is the closure in of some manifold in and hence 
the above proof goes through with I2;( _,- - - , v4) substituted for M. 


(k= 1,2,--+); 
(k=0,1,---); 

thw 
THEOREM 1.8. Let $\mathfrak{M}$ be a closed linear manifold in $\mathfrak{L}_p$. Let $p \le r \le p_1$,
1/p+1/p’ =1/r+1/r' =1/pit1/pi =1, p’2r’=pr’, and let 1/u;+1/u/ 
=1/o/ +1/o;=1, pi, p'2u/ =r’ => pi’. Then 


(r) (r) (r) 


Me PM; ( ; 


(r’) (r) 


= = (MY): 


(r) 


(M, (11, Vex))* = vx); 


(r) 


(r’) 


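2. LINEAR TRANSFORMATION IN $\mathfrak{L}_p$

The set of pairs $\{f_1, f_2\}$, $f_1$ and $f_2 \in \mathfrak{L}_p$, with the norm $(\|f_1\|^p + \|f_2\|^p)^{1/p}$
constitutes a Banach space $\mathfrak{L}_p \times \mathfrak{L}_p$. The set of linear functionals on $\mathfrak{L}_p \times \mathfrak{L}_p$ is
simply isomorphic with $\mathfrak{L}_{p'} \times \mathfrak{L}_{p'}$, $1/p + 1/p' = 1$ (cf. (B), p. 64, pp. 181-183).
Since $\mathfrak{L}_p \times \mathfrak{L}_p$ is simply isomorphic with $\mathfrak{L}_p$, the discussion of linear manifolds
in $\mathfrak{L}_p$ applies to $\mathfrak{L}_p \times \mathfrak{L}_p$ also.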


, 
(v1, , Venza). 


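DEFINITION 2.1. A set $\mathfrak{T} \subseteq \mathfrak{L}_p \times \mathfrak{L}_p$ is said to constitute a transformation $T$
in $\mathfrak{L}_p$, if no two distinct pairs of $\mathfrak{T}$ have the same first elements. If $\{f_1, f_2\} \in \mathfrak{T}$, then
we also write $Tf_1 = f_2$. The set $\mathfrak{D}$ of first elements of the pairs of $\mathfrak{T}$ is called the
domain of $T$, the set $\mathfrak{R}$ of second elements of $\mathfrak{T}$ is called the range of $T$.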


DEFINITION 2.2. A transformation $T$ in $\mathfrak{L}_p$ is said to be linear if $\mathfrak{T}$ is a
linear manifold.


DEFINITION 2.3. A transformation $T$ in $\mathfrak{L}_p$ is said to be closed if $\mathfrak{T}$ is closed.


The definitions of $T_1T_2$, $T_1 + T_2$, $aT_1$, where $a$ is a complex number, are
standard, and will be assumed as will such facts as that if $T_1$ and $T_2$ are
linear, $T_1T_2$, $T_1 + T_2$ are linear. Our procedure would be just that of (S),
chapter 2.

In what follows we shall restrict our discussion to linear transformations
and we shall use the symbols $[T]^r$, $T \cdot \mathfrak{L}_r$ applied to transformations to mean
the transformations corresponding to $[\mathfrak{T}]^r$, $\mathfrak{T} \cdot \mathfrak{L}_r \times \mathfrak{L}_r$, when these last sets
constitute transformations. $T_1 \subseteq T_2$ is to mean of course $\mathfrak{T}_1 \subseteq \mathfrak{T}_2$.

A somewhat different procedure is used to define $T^\perp$.


DEFINITION 2.4. If the pairs of $\mathfrak{T}^\perp$ with their order inverted constitute a
transformation in $\mathfrak{L}_{p'}$, it will be denoted by $T^\perp$. The adjoint of $T$, $T^*$, is defined
as $-T^\perp$ when the latter exists.†


† This definition coincides, when $T$ is limited, with the adjoint defined in (B), p. 99.
THEOREM 2.1. If $T$ is a closed linear transformation in $\mathfrak{L}_s$, then $\mathfrak{T} \cdot \mathfrak{L}_r \times \mathfrak{L}_r$,
$r>s$, constitutes a closed linear transformation in $\mathfrak{L}_r$.


This follows from Theorem 1.1 and Definition 2.1.


THEOREM 2.2. If T is a linear transformation in &, and if for some t<s, 
|’ exists, then for every r such that t<r<s, |T |" exists and T. If 
[7 |'-2,=T then [T]|"-2,=T. 

By Theorem 1.2, By Theorem 2.1, 
|= ]*- &, constitutes a transformation. Hence by Definition 2.1 must 
also, since it is included in [T]‘-&,X,. The remaining statements of the 
theorem follow immediately from Theorem 1.2. 


THEOREM 2.3. If Tis a linear transformation in &%,, with domain everywhere 
dense in &,, and if for some t<s, [T]* exists, then for every r such that tSr<s, 
|" exists and has domain everywhere dense in &,, and [T [T]*. 


This follows from Theorems 2.2 and 1.6, since by Theorems 1.2 and 1.1, 

THEOREM 2.4. If $T$ is a closed linear transformation in $\mathfrak{L}_s$, and for some
$t>s$, $T \cdot \mathfrak{L}_t$ has a domain everywhere dense in $\mathfrak{L}_t$, then for any $r$ such that $t \ge r \ge s$,
$T \cdot \mathfrak{L}_r$ is closed and linear and has domain everywhere dense in $\mathfrak{L}_r$.


This follows from Theorem 2.1 and the fact that the domain of $T \cdot \mathfrak{L}_r$
includes that of $T \cdot \mathfrak{L}_t$ and Theorem 1.6.


THEOREM 2.5. If $T$ is a linear transformation in $\mathfrak{L}_p$, $T^\perp$ (and $T^*$) exists if
and only if $\mathfrak{D}$ is everywhere dense in $\mathfrak{L}_p$. If $T^\perp$ exists it is a closed linear
transformation.


Since $\mathfrak{T}^\perp$ consists of all pairs $\{f^*, f\}$ of $\mathfrak{L}_{p'} \times \mathfrak{L}_{p'}$, such that

$$\int_0^1 f^* g\,dx + \int_0^1 f\,Tg\,dx = 0$$

for all $g \in \mathfrak{D}$, one can readily obtain a proof of this Theorem analogous to the
proof of (S), Theorem 2.6 and Theorem 2.7.
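
The first assertion can be read off from this description. The sketch below spells out the step; it uses Theorem 1.3 in the form that the orthogonal complement of $\mathfrak{D}$ reduces to zero if and only if the closure of $\mathfrak{D}$ is all of $\mathfrak{L}_p$, which is a paraphrase rather than a statement quoted from the text.

```latex
% The inverted pairs \{f, f^*\} form a transformation iff f = 0 forces f^* = 0
% (the set of pairs is a linear manifold). Putting f = 0 in the defining
% relation leaves
\int_0^1 f^{*} g\,dx = 0 \quad \text{for all } g \in \mathfrak{D},
\qquad \text{i.e.} \qquad f^{*} \in \mathfrak{D}^{\perp} .
% Hence the perpendicular transformation exists
%   <=>  \mathfrak{D}^{\perp} = \{0\}
%   <=>  (\mathfrak{D}^{\perp})^{\perp} = [\mathfrak{D}]^{p} = \mathfrak{L}_p ,
% that is, iff \mathfrak{D} is everywhere dense in \mathfrak{L}_p.
```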


THEOREM 2.6. If $T$ is a linear transformation in $\mathfrak{L}_p$, and $\mathfrak{D}$ is everywhere
dense, $[T]^p$ exists if and only if $T^\perp$ has domain everywhere dense in $\mathfrak{L}_{p'}$.
$[T]^p = (T^\perp)^\perp$ if $[T]^p$ exists.†

If $[\mathfrak{T}]^p$ constitutes a transformation, since by Theorem 1.3, $[\mathfrak{T}]^p = (\mathfrak{T}^\perp)^\perp$,
$(\mathfrak{T}^\perp)^\perp$ constitutes a transformation, hence $T^\perp$ has domain everywhere dense
in $\mathfrak{L}_{p'}$, by Theorem 2.5. If $T^\perp$ has domain everywhere dense in $\mathfrak{L}_{p'}$, then
$[\mathfrak{T}]^p = (\mathfrak{T}^\perp)^\perp$ constitutes a transformation by Theorem 2.5.


† This is of course a simple generalization of Theorem 2 of (N).
DEFINITION 2.5. Let $p$, $r$, $p_1$, $u_i$, $v_i$ be as in Definition 1.3. Then if $T$ is a
closed linear transformation in $\mathfrak{L}_p$, we define $T_i^{(r)}(\ ,\cdots,\ )$ as the
transformation corresponding to $\mathfrak{T}_i^{(r)}(\ ,\cdots,\ )$, when the latter manifold constitutes a
transformation.


DEFINITION 2.6. Let $p$, $r$, $p_1$, $u_i$, $v_i$ be as in Definition 1.4. Then if $T$ is a
closed linear transformation in $\mathfrak{L}_p$, we define $T_i^{(r)}(\ ,\cdots,\ )$ as the
transformation corresponding to $\mathfrak{T}_i^{(r)}(\ ,\cdots,\ )$, when the latter manifold constitutes a
transformation.


THEOREM 2.7. Let p,r, pi, p’,r', pi Ui, Ui 04 be asin Theorem 1.7. Let T 
be a linear transformation with domain everywhere dense in %,, and such that 
exists. Then 

(a) Ti ( ,-+-+,  ) exists, is closed and linear, and has domain everywhere 
dense in &,; 

(b) (T( ,--+,  ))* exists, is closed and linear, and has domain every- 
where dense in &,; 

(r’) 


ol?) 
° , = (T*)2 


(r’) 


, , 
(Ty +++ , = (T*)e (ti, , 
(r) 


(T2 (v%, = (T+); (v1, Usk); 


o>, 
(T2 (v1, +, = (T*), (v1, Vor+1)- 


That 7,” and 7, exist, follows from Theorems 2.3, 2.2, and 2.1. Theo- 
rem 2.3 also implies that 7,“ has domain everywhere dense in &,, and that 
T: 2 7,. Hence 72 has domain everywhere dense. From Theorem 1.7, 
we see that T2932 ,---, )2%,. Hence ,---, ) exists, is 
closed and linear by Definitions 2.2 and 2.3, and has domain everywhere 
dense, i.e., (a) holds. Theorems 2.5 and 2.6 and (a) imply (b); (c) and (d) fol- 
low from Definition 2.4, Theorem 1.7 and (b). 


THEOREM 2.8. Let p,r, pi, p’, 7’, pi, ui, ui , vi, 0/ be as in Theorem 1.7. Let 
T be a linear transformation in &,, such that T+ exists and T+: %,, has domain 
everywhere dense in 2,,». Then the statements (a), (b), (c), and (d) of Theorem 2.7 
hold. 


Since 7+ exists, T has domain everywhere dense in %, (Theorem 2.5). 
Since = and - &,,, has domain everywhere dense in &,,:, 
and by Theorem 1.6, 7, considered in &,,, has domain everywhere dense in 
,,, Theorem 2.6 implies that [7] exists. Hence we have satisfied the hy- 
pothesis of Theorem 2.7 and may infer its conclusion. 


(m1, Vex); 
The proof of the following two theorems is similar to that of Theorems 2.7
and 2.8.


THEOREM 2.9. Let p, 7, fi, p’, 7’, pi, Ui, Ui, Vi, vf be as in Theorem 1.8. 
Let T be a closed linear transformation in &%,, such that T- 2,, has domain every- 
where dense in %,,. Then 

(a) T5( ,--+,  ) exists and is a closed linear transformation with do- 
main everywhere dense in &,; 

(b) ,---+, exists and has domain everywhere dense in 

(c) T( )2 Ti; 

(d) (Ti)* =(T*)2; (T2)* = ; 


(r’) 


(Ty (ur, +++ = 


(r 
(7; U2x41))* = (T*)s 


(73 (1, = 


(ui, , Vex); 
(11, 


(v1, 


(T2 (v1, Vox41))* (T+); (01, Vox). 


THEOREM 2.10. Let p, 7, pi, p’, r’, pi, ui, ul, vi, vf be as in Theorem 1.8. 
Let T be a closed linear transformation in &,, such that T+ exists and has domain 
everywhere dense in 2,-, and such that [T+]? exists. Then statements (a), (b), 
(c), and (d) of Theorem 2.9 hold. 


3. SKEW-SYMMETRIC TRANSFORMATIONS IN $\mathfrak{L}_p$


DEFINITION 3.1. Let $1<p<2$. A closed linear transformation $H$ with domain
everywhere dense in $\mathfrak{L}_p$ is said to be $p$-skew-symmetric if $H^\perp \subseteq H$. It is said
to be $p$-auto-perpendicular if $[H^\perp]^p = H$.


THEOREM 3.1. If H is p-skew-symmetric in and p< pi then H-%,, 
is p,-skew-symmetric and (H- &,,)* = |H* 

Since Hi By Theorem 2.6, H+ has domain 
everywhere dense in %,-. Hence H- 2, has domain everywhere dense in &,-. 
From Theorem 2.9, we obtain that (H- 2,,)* =(H2»)+ = (H*),@? = [H+]. 
From H+ ¢H-,,, and the fact that H-,, is closed, we see that (H- &,,)* 
cH-&,,. 

THEOREM 3.2. If H is p-skew-symmetric in %,, [iH*]? is an Hermitian 
transformation in and (|iH* ]*)* =iH - [iH* |? is self-adjoint, if and only 
if H-% = [H*]?. 

({iH+ |*)* = —([iH+ ]?)+ =i( [H+ ]?)+ =7H-&, by Theorem 3.1 and Theo- 
rem 2.6. By Theorem 3.1, iH- %2 {iH*]?. Since H+ has domain everywhere 
dense in $\mathfrak{L}_{p'}$, $[H^\perp]^2$ has domain everywhere dense in $\mathfrak{L}_2$, by Theorem 2.9,
as above. Hence $[iH^\perp]^2$ is Hermitian.


THEOREM 3.3. When H is p-auto-perpendicular, then H- %,,=H?*. 


By Theorem 2.9, (H-%,-)+=[H+]*"=H. Hence by Theorem 2.6, 
=H. 


THE CLOSURE IN $\mathfrak{L}_p$ OF TRANSFORMATIONS IN $\mathfrak{L}_2$


THEOREM 3.4. Let $T$ be a closed linear transformation in $\mathfrak{L}_2$ with domain
everywhere dense in $\mathfrak{L}_2$. Then if $1<p<2$, and $p \le r \le 2$, then $[T]^r$ exists for all
such $r$'s, if and only if $T^\perp \cdot \mathfrak{L}_{p'}$ has domain everywhere dense in $\mathfrak{L}_{p'}$.


From Theorems 2.8 and 2.5, we see that if $T^\perp \cdot \mathfrak{L}_{p'}$ has domain everywhere
dense, then $T_1^{(r)} = [T]^r$ exists. If, on the other hand, $[T]^p$ exists, by Theorem
2.7, we see that $T^\perp \cdot \mathfrak{L}_{p'}$ has domain everywhere dense in $\mathfrak{L}_{p'}$.


THEOREM 3.5. Let H be 2-auto-perpendicular. Let p’=2, 1/p+1/p’ =1, 
psrs2.If H-%,- has domain everywhere dense in and [H - &,- |? =H, then 
|H |" exists and is r-auto-perpendicular. 


[|H |" exists by Theorem 3.4 and we shall show that it is r-auto-perpendicu- 
lar. Now H. Hence H=[H-&,-]? and [H-&,-]" 
=|H]’. Since H is 2-aito-perpendicular, Ht=H. Now by Theorem 2.7, 
=(H,)+ = =H" =H-&,,. Since we have 
that |H|* is r-auto-perpendicular. 

It should be pointed out in connection with Theorem 3.5, that if the con- 
ditions of the theorem are satisfied we cannot conclude that H is of simple 
lace between p and 2, since we are not able to infer that [H |" = [H]?- &,. 


THE CLOSURE OF PROJECTIONS IN $\mathfrak{L}_p$


THEOREM 3.6. Let E be a projection in %, E+ = —E. Let M be the range of 
E in &. Then [E|» exists if and only if [M]*- [M+ ]” = {0}. 

[E]» exists if and only if —E-%,- has domain everywhere dense in &,,., 
by Theorem 3.4. The domain of E- &,. is the set in &,, of all elements in the 
form fitfe, foeM*- &, 

Now suppose Since we have 
giedx=0, for all feM+-L,-, since ge[M]”. Similarly Sighdx = 0, for all 
fieM- Hence = 0, hence the domain of E: is not every- 
where dense. Now conversely, if the set { fitfe} does not determine £,,, 
then there is a such that Sighidx = 0 and Sighdx = for all fieM- 
and foeM+-L,-, and hence ge(M- (M+ - = [Me ]”- [M]». 


THEOREM Let E be a projection in such that [E exists. Then [E|» 
is a limited transformation if and only if for every closed linear manifold IM’ 
in such that M’c M-L,», M’+([M]”)ag 
Let the condition be satisfied. Let $f$ be any element in $\mathfrak{L}_p$. Let $\{cf\}^\perp = \mathfrak{N}$;
if and only if fe(M-L,-)+. Now if fe(M- = [Me 
then [£]?f exists and is zero. Suppose however that f is not in (M- &,-)+. Let 
M’=N-M- L,-, and since M’c M- L,-, by hypothesis there is a geM’+ - [Mt]? 
and g~0. Now there is an x in M-@,-, such that M-&,-= {ax}-+-M’. 
For let xeM-,, be such that Si fedt#0. Then if vy is any element of M- 
y- fedt)xeN-M- L,,=M’. Now S.gzdt+0, otherwise g would be in 
(M- L,-)+, since geM’+, but we know that ge[M]? and since [E]? exists and 
g~0, this is impossible, by Theorem 3.6. Let f—( fedt/ then 
S-hedt =0 and SJ =0 for all z in M’, hence =0 for all yeM- Lp. 
Hence he(M- L,-)+ = [M+ ]” and / is in the domain of [E]’. So also is g, since 
ge(M]|”. Hence f=h-+<cg also is in the domain of [E]”. Hence the domain of 
[E]|” is 2, and since [E]? is closed, it is limited. (Cf. (B), chapter III, p. 41, 
Theorem 7.) 

Now conversely, let [E]” be limited and let IN’ be any closed linear mani- 
fold such that M’c M- L,-. Let xeM- L,,, but x not in M’. There is an f in &,, 
such that /.fdt~0 and Jf. fadt=0 for all z in M’. (Cf. proof of Theorem 1.3.) 
Since [E]? is closed, limited, and has domain everywhere dense, the domain 
of is &,. Let [E]*f=g, f=g+h. Now g is an element of M’*, since feM”*, 
andt = [M-L,-]4 We also have J. gzdt¥0, since =0 
and f fidt#0. Hence g~0. Since [E]*f=g, ge[M]”. Hence 
g~0. 

4. EXAMPLES 

EXAMPLE 1. We give an example of a skew-symmetric transformation 

having a double-valued lace between 3/2 and 3. 


DEFINITION 4.1. Let T/ be the transformation in &,, such that | g, g:} €&,X &, is 
in if g=(k(y)) dex), where k(y) (1 
and lim, .o (k(y))'/?g = 0, limy.. =0. Let T/’ be the transformation in 
such that {g, is in if g=(k(y)) 

4.1.T/ ¢T/’. BothT; and have domains everywhere dense in 

T; is obviously included in 77’ and if we show that 7/ has domain every- 
where dense, then we have proved the statement. Now note that if geD/, 
then 


d 
g = — (k)'/2g, 
dy 


also that k(y) is bounded. Let S denote the set of absolutely continuous func- 
tions g with bounded right and left derivatives, such that for some e such that 
O<eS3, g(y)=(g(e)/e)y for O<y<e, and g(y)=(g(1—e)/e)(1—y), for 


t Since and (J—[E]}»)f=h. 




1—e<y<1. Now S€®D,/, for every r, since one can readily show that both 
g and 7/g are bounded. Since it is readily shown that © is dense in &,, so 
are D/ and D/’. 

4.2. If 1<r<3/2, the ranges of T/ and T/’ are everywhere dense. If 
3/2<r<3, the range of T/ is the closed linear manifold of all gie%, such that 


1 
f = 0, 
0 


and that of T?’ is %,. For r=3, the range of T; and that of T/’ is the closed 
linear manifold such that (*) is satisfied. 
For %,, we recall that the range of 7; consists of all functions g,e%,, such 


that 
y 
g= f is in &,, 
1/2 


and such that 
lim = lim = 0 
yo 
for some value of c, and the range of 7/’, all gie®, such that ge®,, for some 
value of c. 
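First suppose that $1<r<3/2$. Those $g \in \mathfrak{L}_r$ for which

$$\int_0^1 g\,k^{-1/2}\,dx = K(g)$$

exists are everywhere dense, since they include $\mathfrak{L}_2$. The same is true for the
set $\mathfrak{N}$ of such $g$'s for which $K(g) = 0$. For given an $\epsilon>0$ and an $f$ in $\mathfrak{L}_r$, there
exists a $g$ such that $K(g)$ exists and such that $\|f - g\|_r < \epsilon/2$. Since $K(g)$ is not
a limited functional on $\mathfrak{L}_r$, there exists an $h$ such that $K(h)$ exists, $\|h\|_r = 1$
and $|K(h)| > 2|K(g)|/\epsilon$, and $g' = g - (K(g)/K(h))h$ is such that $K(g') = 0$ and
$\|g - g'\|_r < \epsilon/2$. Hence

$$\|f - g'\|_r \le \|f - g\|_r + \|g - g'\|_r < \epsilon.$$

But it is easy to see that the range of $T_r'$ is exactly $\mathfrak{N}$. Thus the range of $T_r'$
is everywhere dense in $\mathfrak{L}_r$, and since $\mathfrak{R}_r'' \supseteq \mathfrak{R}_r'$, so is $\mathfrak{R}_r''$.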
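
The two properties claimed for $g'$ follow at once from the linearity of $K$ and the choice of $h$. A short verification, using only the quantities named in the paragraph above:

```latex
% K is linear, so with g' = g - (K(g)/K(h)) h:
K(g') = K(g) - \frac{K(g)}{K(h)}\,K(h) = 0,
\qquad
\|g - g'\|_r = \frac{|K(g)|}{|K(h)|}\,\|h\|_r < \frac{\epsilon}{2},
% since \|h\|_r = 1 and |K(h)| > 2|K(g)|/\epsilon. The triangle inequality then gives
\|f - g'\|_r \le \|f - g\|_r + \|g - g'\|_r < \epsilon .
```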

Since for 3/2<r<3, also 3/2<r’<3 and (k(y))-"e,-, and K(g) exists 
for all ge,, the proofs of the given statements are extremely simple and we 
omit them. 

Since (k(y))—? is not in 2,, r=3, if for g; there is a c such that 
1/2


then c is uniquely determined and equals Pde. 
Thus we must consider those g,’s for which 


0 


We can easily verify that g; must be such that fgk-'/*dx =0. 
This is also sufficient for g; to be in R/ . For if g:e%, and such that K(g,) =0, 
then 


z i—z 
f gik dy = f (g1/k-/?)dy = o(xt/r’—1/8) ; 
0 0 


and 


1—z 
f = f = o( 7’—2/3) ; 
0 0 


and since 1/r’+1/r=1, 1/r’ =1—1/r=2/3, gis bounded and hence in &,, and 


lim = lim = 0. 
yl 
We have also shown that for 3<r, T/ =T;/’. 
We shall use certain results, which may easily be proved by elementary 
methods. We collect them here. 
4.3. (a) If r>3/2, h(x)e&,, then 


” 1 
f = , f = 
0 
If furthermore then = 
(x) h(x) dx = 
(b) If r<3/2,0<a<1, h(x)e&,, then 


a 
f = , f k-2(x)h(x)dx = 
a 

4.4. For ,1/r+1/r' =1. For 3/2 <r<3, 

Let us first consider the case 3<r. Then T/+ consists of all pairs {f, f*} 
in &,-, such that 


(1) f reas = 0 
when =0 and by 4.2. Now


1 1 z 
f = f f fds 
0 0 0 
z 
= lim f dx. 
0 


Integrating by parts the expression on the right and using 4.3 (a) and (b) 


yields 
1 1-9 y 
ff = — lim f dy. 
0 ” 


” 


Thus (1) implies 


1 y 
f fiidx = lim f f dy. 
0 70 


Let » be fixed and let 9’ be the set of all g,’s in R/ such that g,=0 for 
O<xSn, 1—nSx1. Then 


7] 
feuds = f f dy. 


This, by 4.2 and the definition of 2’, implies that there is a constant a, de- 
pending only on », such that after an inconsequential change in the definition 
of f if necessary 


f = 4 f dy 


for 7 <x<1—n. Hence for this range of x, 


d 
dy 


f = G + f 
1/2 


This last result is independent of 7. Thus we have shown that 7/+¢ 7,/’. 
But if {f, f*}e,/’ then by integration by parts and using 4.3 (a) and (b), 
we can show that (1) holds for all {g, g:} «&/. Thus 7,’ €¢7/+ and from our 
previous inclusion we infer that 7/+=T7,/’. 

One can easily simplify the above argument to prove the desired results 
for 3/2<r<3. This we omit. 
For $r<3/2$, we have that $T_r'^\perp$ consists of all pairs $\{f, f^*\}$ in $\mathfrak{L}_{r'} \times \mathfrak{L}_{r'}$, such
that (1) holds for all $\{g, g_1\} \in \mathfrak{T}_r'$. But


1 1 z 
f f*adx f + f dx 
0 0 1/2 


/ 


1/2 1/2 
1/2 1/2 


1-1 z 
— lim + f 
1/2 


z 
— lim + f 
1/2 


70 n 


lim (« + f = lim = 0, 
1 


0 


lim (< + f = lim f = 0, 
10 1/2 Jo 


70 /2 


and the integral /,f*k-'"dy exists. But 


z 
lim f +f 
70 5 1/2 


1-9 z 
lim + f 
0 


” 


z 
0 


1 


0 


Hence by (1), 


1 z 
f feidx = lim f aude. 
0 70 0 


For a fixed 7>0, if we restrict ourselves to such g; in RY for which g; =0 for 
0<x<n and <1, we see that


since 
since
f = ¢k-1/2 +
0 


One can easily show that c is independent of yn. But since f is in &-, r’23, 
considering f in the neighborhood of the origin leads us to infer that c=0, and 


f= f pth 
0 


But from the proof of 4.2, we see that fis in &,-, only if Siftkdy =0. Hence 
{f, f*}«Z,-. Thus we have shown that 7/+¢7,’. But a familiar argument 
will easily show that 7/42 7,’. Hence T/+=T7,/. We also note that this 
shows that 7,/ is closed. 

Now we have also shown before that 7+ =T7/’. Since 7,’ is closed and 
has domain everywhere dense, this implies that T,’ = 7,/++ =7/'+. This con- 
cludes the proof of the statement. 

Our theorems on the relationships of closure and the perpendicular of 
transformations now permit us to show the general closure relationships in- 
volved. 

4.5. If 3/2<t<r, then [T/]'=T!. If t<r and t<3/2, [T/]*=T?'. If 
tsr<3, If t<r, Ti Ti’ 

The statements concerning intersections are immediate consequences of 
the definitions. The rest follows from Theorems 2.7 and 2.6 and 4.4, and the 
intersection relationships. 

4.5 implies that Tf has a double lace between 3 and 3/2. It also tells us 
that 73,2 is (3/2)-auto-perpendicular, while 73,2’ -22= is not 2-auto-per- 
pendicular. 

EXAMPLE 2. We give an example of an operator, which is self-adjoint 
in %, of simple lace between 2 and p=1} (p<2), but for which two projec- 
tions in its corresponding resolution of the identity do not have transforma- 
tions as closures in p<3/2. 

Let ¢:(x) 12>x%21/8; gi(x) = 1/27 S52<1/8, =0, 
0<x<1/27. 

<(2"-!+1)-; =0, OS <(2"+1)-*. 

One can verify by direct calculation that if n<m, 


1 (2"-141)-38 1 
f ondmdx = — f + f = 0. 
0 "41973 2” 


Hence if n#m, J'nbndx =0. Also that if p<14, then
| 1 -( 1 ( 2? 2°? — 1 ))
On «2/8 26 \ (29-1 4 1)-2ets 


+ 
Hence for such a #, 


lim ||¢.(x) — x-2/3|, = 0. 


Now let 


b, = max 
S/4S ps5 


and let H, be the operator in £%,, 1 $5, which is constituted by the set 
of pairs, {f, fi}, in which f; is related to f in such a manner that 


1 
fi = if den = band f 
1 0 


We shall show that for such p’s the domain of H, is %,. We have for 
any f in &,, 


— Do axdex|| =|] || S > | | »- 
j=1 k=1 Pp 


k=m Pp k=m 


1 — 
0 


m 1 1 
j=l =m 


Hence f, exists for every f and H, has domain &,. A similar proof will show 
that H, has a bound less than or equal to 1. 


If 1/p+1/p’ =1, then 


n=1 2" den n=1 2" dan 
1 
= f fllygdx. 
0 


Hence H,'2 —H,,, but since H, has domain %,, H,“=—H, and 
H* =H,,. In particular this holds for p=2; thus Hz is self-adjoint. Now if 
p>pi, 14, then since H,+-2,;= —H,’- —H,y =H,,", 


Pp
we have |H,|":=H,,. We also have that H,,-2,=H,. Thus the lace of H,
is simple in the range stated. 

Now consider the case p= 2. For the resolution of the identity associated 
with H2, we have that if M is the closed linear manifold determined by the
set {do}, then E(O—) =0, E(0) is the projection on M+, and E(1) — E(0) is 
the projection on M. But since [M]?- [M+ {x-2/3}, p<3/2, we see by 
Theorem 3.6 that while H,= |H:]? is bounded for 14<p<2, there are pro- 
jections in the resolution of the identity associated with H2 which do not 
have transformations for closures in &,, if p<3/2.

QUASI-COMMUTATIVE RINGS AND
DIFFERENTIAL IDEALS*


BY 
NEAL H. McCOY 


Introduction. In the quantum mechanics, an important role is played by
elements p and q, either infinite matrices or differential operators, which
satisfy a commutation rule of the form


(1)        $pq - qp = c$,


where c is a scalar and is therefore commutative with both p and q. The
importance of this relation has inspired the development of a considerable
number of commutation formulas for polynomials in p and q, with coefficients
in the complex number field.† However, for the most part, these formulas
make no use of the fact that c is a scalar, but merely that it is commutative
with both p and q. And although relation (1), with c a scalar, is impossible
for elements of a finite algebra, it was pointed out in a recent paper‡ that
there do exist pairs of finite matrices A, B such that AB-BA is not zero,
but is commutative with both A and B. Thus the various commutation
formulas for polynomials in p and q go over at once into corresponding ones
for polynomials in A and B. This suggests the problem of characterizing all
algebras, and more generally all rings, whose elements are polynomials in two
given elements $\xi$, $\eta$ with coefficients in a suitable domain, it being assumed
that $\xi\eta - \eta\xi$ is commutative with both $\xi$ and $\eta$. Such rings are of some
mathematical interest in that while they are not in general commutative, they
are quite closely related to commutative rings and are perhaps in certain
respects the most simple non-commutative rings. It is the primary purpose of
the present paper to consider rings of this type.
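
The existence of finite matrices A, B whose commutator is non-zero yet commutes with both can be illustrated by a minimal sketch. The particular 3-by-3 matrices below (the matrix units usually written E12 and E23) are an assumption chosen for illustration; they are not taken from the paper cited above.

```python
# Illustrative choice (an assumption, not from the cited paper):
# A = E12 and B = E23 in the 3x3 matrices over the integers.

def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matsub(a, b):
    """Entrywise difference a - b."""
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

A = [[0, 1, 0], [0, 0, 0], [0, 0, 0]]
B = [[0, 0, 0], [0, 0, 1], [0, 0, 0]]
ZERO = [[0, 0, 0] for _ in range(3)]

# C = AB - BA is non-zero but commutes with both A and B.
C = matsub(matmul(A, B), matmul(B, A))
```

Here C has a single non-zero entry in the upper right corner, so A and B generate a finite-dimensional ring of the kind studied in this paper.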

Let K denote a commutative ring with unit element $\epsilon$.§ We now adjoin to
K (ring adjunction) two elements $\xi$, $\eta$ which are assumed to be commutative
with elements of K, and are such that the element $\zeta = \xi\eta - \eta\xi$ is commutative
with both $\xi$ and $\eta$. This ring will be denoted by $K[\xi, \eta]$, and may be called a
quasi-commutative ring over K. Different quasi-commutative rings may be obtained

* Presented to the Society under the title On certain rings and differential ideals, September 5,
1934; received by the editors January 27, 1935.

† See, e.g., a previous paper, On commutation formulas in the algebra of quantum mechanics,
these Transactions, vol. 31 (1929), pp. 793-806.

‡ N. H. McCoy, On quasi-commutative matrices, these Transactions, vol. 36 (1934), pp. 327-340.

§ Unless otherwise stated, the notation and terminology will follow as closely as possible that of
van der Waerden, Moderne Algebra, Berlin, 1930 and 1931.

by imposing additional conditions on $\xi$ and $\eta$. But it is not obvious
what new conditions are self-consistent, as well as consistent with those
already imposed. Our first problem is therefore the characterization of all
quasi-commutative rings over K.

In §1, we shall define a quasi-commutative ring $R = K[\alpha, \beta]$, which has
the property that any quasi-commutative ring $K[\xi, \eta]$ is homeomorphic to R.
It follows that $K[\xi, \eta]$ is isomorphic to the quotient ring R/M, where M
is the two-sided ideal in R consisting of those elements of R which correspond
to the zero element of $K[\xi, \eta]$.* The problem of characterizing the different
rings $K[\xi, \eta]$ is thus reduced to that of characterizing in some simple way
the two-sided ideals in R. In order to do this, we introduce in §2 a commutative
polynomial ring R' = K[x, y, z] whose elements can be put in a one-to-one
correspondence with the elements of R. The significance of the correspondence
between these rings is found to depend upon the notions of differential
ring and differential ideal.† Accordingly, we discuss these concepts in some
detail in §3, which is independent of the rest of the paper. In particular, we
show that one of E. Noether's decomposition theorems remains valid if all
ideals are required to be differential ideals.

The characterization of the two-sided ideals in R is obtained in §4. It is
found that a set M of elements of R is a two-sided ideal in R, if and only if
the corresponding set M' of elements of R' is a differential ideal of a certain
kind in R'. It follows that there is a very close connection between the
quotient rings R/M and R'/M'. Thus a number of properties of the
quasi-commutative ring R/M can be determined from a knowledge of the
corresponding properties of the commutative ring R'/M'.

In §5, we discuss briefly the special case in which K is a non-modular
field and $K[\xi, \eta]$ is a finite algebra over K.

It may be remarked that the relations 


        $\xi\eta - \eta\xi = \zeta, \quad \xi\zeta - \zeta\xi = 0, \quad \eta\zeta - \zeta\eta = 0$


are precisely those which are satisfied by the infinitesimal transformations of 
a three-parameter continuous group with structure constants 0, 0, 1; 0, 0, 0; 
0, 0, 0. The problem discussed in this paper may therefore be considered as 
that of determining the realizations of such three-parameter groups. 

1. The ring $R = K[\alpha, \beta]$. Let K be a commutative ring with unit element
$\epsilon$, and $K[\xi, \eta]$ any quasi-commutative ring over K. If we set $\zeta = \xi\eta - \eta\xi$, it


* See van der Waerden, op. cit., I, pp. 56-58, for a detailed proof for the case of commutative
rings. The necessary modification for the non-commutative case can be made without difficulty. The
term “quotient ring” is used throughout this paper for van der Waerden’s “Restklassenring.”

† See H. W. Raudenbush, Jr., Differential fields and ideals of differential forms, Annals of
Mathematics, vol. 34 (1933), pp. 509-517.

follows as a direct consequence of the fact that $\zeta$ is commutative with both
$\xi$ and $\eta$ that


(2)        $\eta^n \xi^m = \sum_{s=0} (-1)^s \, s! \binom{m}{s} \binom{n}{s} \, \xi^{m-s} \eta^{n-s} \zeta^s$,


where m and n are any positive integers, the sum being extended to the
smaller of m and n. We shall not give a proof of this formula as it follows
readily by induction on m and n.* We remark that if h is any element of
$K[\xi, \eta]$, it will be understood that $h^0 = \epsilon$.

Each coefficient on the right of (2) is a positive or negative integer. Hence
if we multiply (2) by $\epsilon$, we see that each of the resulting coefficients belongs
to K. By a repeated use of formula (2), it is now clear that each element h
of $K[\xi, \eta]$ can be expressed in the form


(3)        $h = \sum c_{ijk} \, \xi^i \eta^j \zeta^k$        (i, j, k = 0, 1, ...),

where the coefficients $c_{ijk}$ belong to K, and only a finite number are different
from zero. In general, the expression of h in this form need not be unique.
We now pass to a consideration of the most general quasi-commutative
ring over K, which may be defined in the following abstract way. Let $e_{ijk}$
(i, j, k = 0, 1, ...) be undefined symbols, and denote by R the set of all
finite sums of the form


        $f = \sum a_{ijk} e_{ijk}$,


where the $a_{ijk}$ belong to K. If $g = \sum b_{ijk} e_{ijk}$, we shall write f = g if and only if
$a_{ijk} = b_{ijk}$ (i, j, k = 0, 1, ...). We now define:

        $f + g = \sum (a_{ijk} + b_{ijk}) \, e_{ijk}$,
        $af = fa = \sum (a \, a_{ijk}) \, e_{ijk}$,        a in K.


It follows from the latter of these relations that $\epsilon f = f\epsilon = f$, for all elements
f of R. We now define a multiplication of the symbols $e_{ijk}$ as follows:


(4)        $e_{ijk} \, e_{lmn} = \sum_{t=0} (-1)^t \, t! \binom{j}{t} \binom{l}{t} \, e_{i+l-t,\; j+m-t,\; k+n+t}$.

This defines a multiplication of elements of R which, by a direct calculation,
can be shown to be associative. Hence R is a ring with unit element $\epsilon = e_{000}$.
If we set $e_{100} = \alpha$, $e_{010} = \beta$, it follows from (4) that $\alpha\beta - \beta\alpha = \gamma$
(where $\gamma = e_{001}$), $\alpha\gamma = \gamma\alpha$, $\beta\gamma = \gamma\beta$, $e_{ijk} = \alpha^i \beta^j \gamma^k$.
Thus R is a quasi-commutative ring $K[\alpha, \beta]$ over K,

* Born and Jordan, Zeitschrift für Physik, vol. 34 (1925), p. 873.

† This will be recognized as essentially the method used by Hamilton to define an algebra over
a given field. See L. E. Dickson, Algebras and their Arithmetics, Chicago, 1923, p. 22.

and hence relation (2) holds with $\xi$, $\eta$, $\zeta$ replaced by $\alpha$, $\beta$, $\gamma$ respectively. It
is then easy to show also that relation (4) is a direct consequence of formula
(2). Thus by repeated use of (4) or (2) any element f of R can be expressed
uniquely as a finite sum

(5)        $f = \sum a_{ijk} \, \alpha^i \beta^j \gamma^k$

with coefficients in K, and any such sum is an element of R.
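
The abstract definition of R lends itself to a small computational sketch. The code below is an illustration only, under these assumptions: K is taken to be the ring of integers, an element of R is a dictionary mapping index triples (i, j, k) to non-zero coefficients, and the product of basis symbols is taken in the form $e_{ijk} e_{lmn} = \sum_t (-1)^t t! \binom{j}{t}\binom{l}{t} e_{i+l-t,\, j+m-t,\, k+n+t}$. The names `mul`, `add` and `power` are hypothetical.

```python
from math import comb, factorial

def mul(f, g):
    """Product in R, extending rule (4) bilinearly to sums of basis symbols."""
    out = {}
    for (i, j, k), a in f.items():
        for (l, m, n), b in g.items():
            for t in range(min(j, l) + 1):
                c = (-1) ** t * factorial(t) * comb(j, t) * comb(l, t) * a * b
                key = (i + l - t, j + m - t, k + n + t)
                out[key] = out.get(key, 0) + c
    return {key: c for key, c in out.items() if c != 0}

def add(f, g, sign=1):
    """Sum f + sign*g, dropping zero coefficients."""
    out = dict(f)
    for key, c in g.items():
        out[key] = out.get(key, 0) + sign * c
    return {key: c for key, c in out.items() if c != 0}

def power(f, n):
    result = ONE
    for _ in range(n):
        result = mul(result, f)
    return result

ONE = {(0, 0, 0): 1}     # unit element e_000
ALPHA = {(1, 0, 0): 1}   # e_100
BETA = {(0, 1, 0): 1}    # e_010
GAMMA = {(0, 0, 1): 1}   # e_001
```

With these definitions, `add(mul(ALPHA, BETA), mul(BETA, ALPHA), -1)` evaluates to `GAMMA`, and associativity can be spot-checked on sample elements; such checks are evidence on examples, not a proof.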

If now $K[\xi, \eta]$ is any given quasi-commutative ring over K, we shall
denote by f* the element $\sum a_{ijk} \, \xi^i \eta^j \zeta^k$. Thus f* is a uniquely defined element of
$K[\xi, \eta]$ corresponding to the element f of R given by (5). It is clear that
(f+g)* = f* + g*. Let us consider (fg)*. By repeated use of relation (2)
(with $\xi$, $\eta$ replaced by $\alpha$, $\beta$ respectively) fg may be expressed as an element
$\sum c_{ijk} \, \alpha^i \beta^j \gamma^k$ of R, and thus $(fg)^* = \sum c_{ijk} \, \xi^i \eta^j \zeta^k$. But formula (2) as applied to
f*g* in precisely the same series of operations will also yield $\sum c_{ijk} \, \xi^i \eta^j \zeta^k$.
Hence (fg)* = f*g*, and thus the correspondence $f \to f^*$ is a homeomorphism
between R and $K[\xi, \eta]$. We have therefore shown that any quasi-commutative
ring over K is homeomorphic to R, and the problem of determining the
various quasi-commutative rings over K is reduced to that of finding the rings
which are homeomorphic to R, and this in turn is equivalent to the
determination of all two-sided ideals in R.

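If f is the element (5) of R, we may define $\partial f/\partial\alpha$ to be the uniquely defined
element $\sum i \, a_{ijk} \, \alpha^{i-1} \beta^j \gamma^k$ of R, and $\partial f/\partial\beta$ similarly. It then follows by a
simple application of relation (2) that


(6)        $\alpha f - f\alpha = \gamma \, \frac{\partial f}{\partial\beta}, \qquad f\beta - \beta f = \gamma \, \frac{\partial f}{\partial\alpha}$.


These are familiar formulas in the quantum mechanics.

Since the product of n consecutive integers is divisible by n!, we remark
that no matter what the characteristic of the ring K, $(1/n!) \, \partial^n f/\partial\alpha^n$ is an
element of R (n = 1, 2, ...).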
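
Relations of the type (6) can be spot-checked on a sample element. This is a sketch under the following assumptions: K is the ring of integers, elements of R are dictionaries keyed by (i, j, k) standing for $\alpha^i \beta^j \gamma^k$, products follow rule (4), and (6) is read as $\alpha f - f\alpha = \gamma\,\partial f/\partial\beta$, $f\beta - \beta f = \gamma\,\partial f/\partial\alpha$. The helper names are hypothetical.

```python
from math import comb, factorial

def mul(f, g):
    """Product in R via rule (4), extended bilinearly."""
    out = {}
    for (i, j, k), a in f.items():
        for (l, m, n), b in g.items():
            for t in range(min(j, l) + 1):
                c = (-1) ** t * factorial(t) * comb(j, t) * comb(l, t) * a * b
                key = (i + l - t, j + m - t, k + n + t)
                out[key] = out.get(key, 0) + c
    return {key: c for key, c in out.items() if c != 0}

def sub(f, g):
    """Difference f - g, dropping zero coefficients."""
    out = dict(f)
    for key, c in g.items():
        out[key] = out.get(key, 0) - c
    return {key: c for key, c in out.items() if c != 0}

def d_alpha(f):
    """Formal partial derivative with respect to alpha."""
    return {(i - 1, j, k): i * a for (i, j, k), a in f.items() if i > 0}

def d_beta(f):
    """Formal partial derivative with respect to beta."""
    return {(i, j - 1, k): j * a for (i, j, k), a in f.items() if j > 0}

ALPHA = {(1, 0, 0): 1}
BETA = {(0, 1, 0): 1}
GAMMA = {(0, 0, 1): 1}

# A sample element of R with integer coefficients.
f = {(2, 3, 1): 5, (1, 1, 0): -2, (0, 4, 2): 7}

left_1 = sub(mul(ALPHA, f), mul(f, ALPHA))   # alpha f - f alpha
right_1 = mul(GAMMA, d_beta(f))              # gamma * df/dbeta
left_2 = sub(mul(f, BETA), mul(BETA, f))     # f beta - beta f
right_2 = mul(GAMMA, d_alpha(f))             # gamma * df/dalpha
```

Both pairs agree for this sample, which matches the term-by-term computation $\beta^j \alpha = \alpha\beta^j - j\beta^{j-1}\gamma$ obtained from (2).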

2. The ring R' = K[x, y, z]. Let x, y, z denote ordinary commutative
indeterminates which are assumed to be also commutative with elements of K,
and denote by R' the ring K[x, y, z] consisting of all polynomials in x, y, z
with coefficients in K. Corresponding to the element (5) of R, we have the
element $f' = \sum a_{ijk} \, x^i y^j z^k$ of R'. This clearly defines a one-to-one
correspondence between elements of R and those of R'. Henceforth we shall let f, f';
g, g'; ... denote pairs of corresponding elements of R and R' respectively.
The following may now be verified:


        $(f + g)' = f' + g'$,
        $(af)' = af'$        (a in K),
        $\left( \frac{\partial^n f}{\partial\alpha^n} \right)' = \frac{\partial^n f'}{\partial x^n}, \qquad \left( \frac{\partial^n f}{\partial\beta^n} \right)' = \frac{\partial^n f'}{\partial y^n}$.
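In order to find what element of R' corresponds to a product fg of
elements of R, it is necessary to express fg in the form $\sum c_{ijk} \, \alpha^i \beta^j \gamma^k$. We shall
now prove that†


(7)        $(fg)' = \sum_{s=0} \frac{(-z)^s}{s!} \, \frac{\partial^s f'}{\partial y^s} \, \frac{\partial^s g'}{\partial x^s}$.


Since differentiation is a linear operation, it is sufficient to establish this
formula for the case in which f and g are single terms of the form $F = \alpha^i \beta^j \gamma^k$
and $G = \alpha^l \beta^m \gamma^n$, respectively. We have from formula (2)


        $FG = \alpha^i \beta^j \gamma^k \, \alpha^l \beta^m \gamma^n = \sum_{s=0} (-1)^s \, s! \binom{j}{s} \binom{l}{s} \, \alpha^{i+l-s} \beta^{j+m-s} \gamma^{k+n+s}$,


and hence


        $(FG)' = \sum_{s=0} (-1)^s \, s! \binom{j}{s} \binom{l}{s} \, x^{i+l-s} y^{j+m-s} z^{k+n+s} = \sum_{s=0} \frac{(-z)^s}{s!} \, \frac{\partial^s F'}{\partial y^s} \, \frac{\partial^s G'}{\partial x^s}$,


which is the desired result. It may be noted that formula (7) is essentially
a formula given by Bourlet‡ for multiplying differential operators.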
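
The passage from a product in R to the corresponding element of R' can be checked on examples. The sketch below assumes integer coefficients, represents elements of both R and R' as dictionaries keyed by exponent triples (so that the correspondence $f \to f'$ is the identity on dictionaries), and reads (7) as $(fg)' = \sum_s \frac{(-z)^s}{s!}\,\partial_y^s f'\,\partial_x^s g'$. To stay within the integers, the factor 1/s! is absorbed into a divided derivative in x. All function names are hypothetical.

```python
from math import comb, factorial

def mul_R(f, g):
    """Non-commutative product in R via rule (4)."""
    out = {}
    for (i, j, k), a in f.items():
        for (l, m, n), b in g.items():
            for t in range(min(j, l) + 1):
                c = (-1) ** t * factorial(t) * comb(j, t) * comb(l, t) * a * b
                key = (i + l - t, j + m - t, k + n + t)
                out[key] = out.get(key, 0) + c
    return {key: c for key, c in out.items() if c != 0}

def mul_poly(f, g):
    """Ordinary commutative product in R' = K[x, y, z]."""
    out = {}
    for (i, j, k), a in f.items():
        for (l, m, n), b in g.items():
            key = (i + l, j + m, k + n)
            out[key] = out.get(key, 0) + a * b
    return {key: c for key, c in out.items() if c != 0}

def add(f, g):
    out = dict(f)
    for key, c in g.items():
        out[key] = out.get(key, 0) + c
    return {key: c for key, c in out.items() if c != 0}

def dy(f, s):
    """s-th partial derivative with respect to y."""
    return {(i, j - s, k): factorial(j) // factorial(j - s) * a
            for (i, j, k), a in f.items() if j >= s}

def divided_dx(g, s):
    """(1/s!) times the s-th partial derivative with respect to x."""
    return {(l - s, m, n): comb(l, s) * b
            for (l, m, n), b in g.items() if l >= s}

def rhs7(f, g):
    """Right-hand side of (7), computed in the commutative ring R'."""
    total = {}
    top = max((j for (_, j, _) in f), default=0)
    for s in range(top + 1):
        zs = {(0, 0, s): (-1) ** s}  # the monomial (-z)^s
        total = add(total, mul_poly(zs, mul_poly(dy(f, s), divided_dx(g, s))))
    return total
```

For instance, with f = 2α²... replaced by any sample dictionaries f and g, `mul_R(f, g) == rhs7(f, g)` is the statement of (7) under the identification of the two representations.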

We shall also require a formula which exhibits the element of R which
corresponds to a product f'g' in R', namely


(8)        $f'g' = \left( \sum_{s=0} \frac{\gamma^s}{s!} \, \frac{\partial^s f}{\partial\beta^s} \, \frac{\partial^s g}{\partial\alpha^s} \right)'$.

Again considering the case f = F, g = G, this formula states that


        $x^{i+l} y^{j+m} z^{k+n} = \sum_{s,t} \frac{(-1)^t \, j! \, l!}{s! \, t! \, (j-s-t)! \, (l-s-t)!} \, x^{i+l-s-t} y^{j+m-s-t} z^{k+n+s+t}$.


This is easily verified, as the term on the right given by s = t = 0 is precisely the
left-hand side, while if p > 0, the coefficient of $x^{i+l-p} y^{j+m-p} z^{k+n+p}$ on the right
is


        $\frac{j! \, l!}{p! \, (j-p)! \, (l-p)!} \sum_{t=0}^{p} (-1)^t \binom{p}{t} = 0$.


† Here, as elsewhere, the existence of s! in the denominator causes no difficulty, as $(1/s!) \, \partial^s f'/\partial y^s$
represents a uniquely defined element of R'.
‡ C. Bourlet, Annales de l’Ecole Normale Supérieure, (3), vol. 14 (1897).


By means of these formulas we shall, in §4, find a characterization of all 
two-sided ideals in R in terms of the sets of corresponding elements of R’. 
But before proceeding to this, we pause to introduce some necessary con- 
cepts of a somewhat different nature. This will be done in the following sec- 
tion, which is independent of the rest of the paper. 

3. Rings with operators. Differential rings and ideals. Let S be a
commutative ring,* and $\Omega$ a set of operators $\Delta$ with the following properties:
(1) if s is an element of S, $\Delta s$ is a uniquely defined element of S; (2) if s and t
are elements of S, then $\Delta(s + t) = \Delta s + \Delta t$. For convenience, we shall refer to S
as an $\Omega$-ring. A set I of elements of S will be said to be an $\Omega$-ideal if I is an
ideal in S, and in addition I is closed under the operators $\Delta$ of $\Omega$.†

We remark that from the equation $\Delta(0 + 0) = \Delta 0 + \Delta 0$, it follows that
$\Delta 0 = 0$. Let now I be an $\Omega$-ideal in S, and denote by s and $\bar s$ corresponding
elements of S and of the quotient ring S/I respectively. We now define $\Delta \bar s$
to be the element $\overline{\Delta s}$ of S/I, and it follows readily that S/I is also an $\Omega$-ring.

Let T denote another $\Omega$-ring. The ring T will be said to be
$\Omega$-homeomorphic ($\Omega$-isomorphic) to S, if T is homeomorphic (isomorphic) to S in the
usual sense, and in addition if $s \to t$ by this homeomorphism, then $\Delta s \to \Delta t$ for
each $\Delta$ in $\Omega$. The quotient ring S/I is clearly $\Omega$-homeomorphic to S. It is now
not difficult to prove the following theorem:


THEOREM 1. If the $\Omega$-ring T is $\Omega$-homeomorphic to the $\Omega$-ring S, then T is
$\Omega$-isomorphic to the quotient ring S/I, where I is the $\Omega$-ideal in S consisting of
those elements of S which correspond to the zero element of T.


This will be recognized as a familiar result provided the symbol $\Omega$ be
omitted from the statement of the theorem.‡ We note first that if $s \to 0$ by
the given $\Omega$-homeomorphism, then $\Delta s \to \Delta 0 = 0$. Hence the set I of elements of
S which corresponds to the zero element of T is actually an $\Omega$-ideal. Let now s
be any element of S, $\bar s$ the corresponding element of S/I, and t the element of
T corresponding to the element s of S. Then, by the known case, it is clear
that the correspondence $\bar s \to t$ is an isomorphism of S/I and T and we only
need to show that this is also an $\Omega$-isomorphism. But by our hypothesis, $\Delta t$


* Some of the results of this section can be extended to the case of non-commutative rings.
However, we shall simplify the discussion by considering only commutative rings, as these are the ones
which are important for our purpose.

† These concepts are essentially those used in the study of groups with operators. See van der
Waerden, op. cit., I, p. 132.

‡ van der Waerden, op. cit., I, p. 57.

is the element of T corresponding to $\Delta s$, and by definition $\Delta \bar s$ is the element
of S/I corresponding to $\Delta s$ of S. Hence $\Delta \bar s \to \Delta t$, and the theorem is
established.

We shall henceforth assume that the ring S has a unit element $\epsilon$, and that
every ideal in S has a finite ideal basis. We shall also assume that each
operator $\Delta$ of $\Omega$ satisfies the further condition (3): if $s_1$, $s_2$ are elements of S, then
$\Delta(s_1 s_2) = s_1 \Delta s_2 + s_2 \Delta s_1$. The ring S may then be called a differential ring, and
an $\Omega$-ideal in S a differential ideal.*

In a differential ring we have $\Delta\epsilon = 0$. For applying the condition (3) to
the case in which $s_1 = s_2 = \epsilon$, we get $\Delta\epsilon = 2\Delta\epsilon$, that is, $\Delta\epsilon = 0$. We shall now prove
a few theorems concerning differential ideals in S.
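
A concrete differential ring may help fix ideas. The sketch below is an illustration under assumptions of its own choosing: S is the polynomial ring in x, y, z with integer coefficients, stored as dictionaries keyed by exponent triples, and the single operator is $\Delta = z \, \partial/\partial x$. It checks additivity, the product rule of condition (3), and the consequence that $\Delta$ annihilates the unit element, on sample polynomials only.

```python
def pmul(f, g):
    """Commutative product of polynomials keyed by exponent triples (i, j, k)."""
    out = {}
    for (i, j, k), a in f.items():
        for (l, m, n), b in g.items():
            key = (i + l, j + m, k + n)
            out[key] = out.get(key, 0) + a * b
    return {key: c for key, c in out.items() if c != 0}

def padd(f, g):
    out = dict(f)
    for key, c in g.items():
        out[key] = out.get(key, 0) + c
    return {key: c for key, c in out.items() if c != 0}

def delta(f):
    """The operator z * d/dx (an illustrative choice of differential operator)."""
    return {(i - 1, j, k + 1): i * a for (i, j, k), a in f.items() if i > 0}

ONE = {(0, 0, 0): 1}                  # unit element
s = {(2, 1, 0): 3, (0, 0, 1): -1}     # 3 x^2 y - z
t = {(1, 0, 2): 2, (3, 2, 0): 5}      # 2 x z^2 + 5 x^3 y^2
```

The zero polynomial is the empty dictionary, so `delta(ONE) == {}` expresses that the operator sends the unit element to zero.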

THEOREM 2. An ideal $I = (a_1, a_2, \ldots, a_k)$ in S is a differential ideal, if
and only if $\Delta a_i \equiv 0$ (I) (i = 1, 2, ..., k).


It is clearly only necessary to establish the sufficiency of these conditions.
If a is any element of I, then we may write


        $a = \sum_{i=1}^{k} b_i a_i$,


where the $b_i$ are elements of S. Thus


        $\Delta a = \sum_{i=1}^{k} (b_i \Delta a_i + a_i \Delta b_i)$.


Hence if all $\Delta a_i \equiv 0$ (I), it follows that $\Delta a \equiv 0$ (I), and I is a differential ideal.
If $I_1$ and $I_2$ are differential ideals in S, their least common multiple
$I_1 \cap I_2 = [I_1, I_2]$ is obviously a differential ideal in S. By Theorem 2, it is
clear that their greatest common divisor $(I_1, I_2)$ is also a differential ideal in S.
THEOREM 3. If I is a differential ideal in S and $I = I_1 \cap I_2$, where $I_1$ and $I_2$
are proper ideal divisors of I such that $(I_1, I_2) = (\epsilon)$, then $I_1$ and $I_2$ are differential
ideals in S.


Under the hypotheses of the theorem, there exist elements $i_1$, $i_2$ of S such
that

        $i_1 + i_2 = \epsilon, \qquad i_1 \equiv 0 \ (I_1), \qquad i_2 \equiv 0 \ (I_2)$.


Let a be any element of $I_1$. Then $a i_2 \equiv 0$ (I), and hence $\Delta(a i_2) = a \Delta i_2 + i_2 \Delta a \equiv 0$
(I), that is, $i_2 \Delta a \equiv 0$ ($I_1$). But $i_2 \equiv \epsilon$ ($I_1$), and thus $\Delta a \equiv 0$ ($I_1$). In like
manner it can be shown that $I_2$ is also closed under the operations of $\Omega$,
which proves the theorem.


* See H. W. Raudenbush, Jr., loc. cit.

An ideal (differential ideal) I may be said to be direct indecomposable*
if it cannot be expressed in the form $I_1 \cap I_2$, where $I_1$ and $I_2$ are proper ideal
(differential ideal) divisors of I such that $(I_1, I_2) = (\epsilon)$. The above theorem
then states that a differential ideal I is direct indecomposable if and only if
it is direct indecomposable when considered as an ordinary ideal.


THEOREM 4. Each differential ideal I in S can be expressed uniquely as the
least common multiple or product of direct indecomposable differential ideals $I_i$:

(9)        $I = [I_1, I_2, \ldots, I_k] = I_1 I_2 \cdots I_k$,

where the $I_i$ are proper ideal divisors of I such that $(I_i, I_j) = (\epsilon)$, $i \ne j$.

Considered as an ordinary ideal, it is known† that I has a unique
decomposition of the form stated, except that the $I_i$ are of course not required to
be differential ideals. We shall show that they are necessarily differential
ideals.

We have from (9), $I = [I_1, [I_2, I_3, \ldots, I_k]]$. Since $(I_1, I_j) = (\epsilon)$ (j = 2, 3, ..., k),
it follows‡ that $(I_1, [I_2, I_3, \ldots, I_k]) = (\epsilon)$, and hence, by Theorem
3, $I_1$ and $[I_2, I_3, \ldots, I_k]$ are differential ideals. A repetition of this argument
proves the theorem.

If $S_i$ (i = 1, 2, ..., r) are differential ideals in S such that each element of
S can be expressed uniquely as the sum of elements which belong respectively
to the $S_i$, then S is said to be the direct sum of the differential ideals $S_i$, and
we write


(10)        $S = S_1 + S_2 + \cdots + S_r$.


A differential ring which can be expressed as the direct sum of two or more
differential ideals may be said to be reducible, otherwise irreducible.

Suppose now that relation (10) is given. From the uniqueness of the
expression of any element of S as a sum of elements of the $S_i$, it follows that
$S_i$ and $S_j$ have no element in common except zero, and thus $S_i S_j = 0$, $i \ne j$.
Considering now the unit element $\epsilon$ of S, we have the following relations:


(11)        $\epsilon = \sum_{i=1}^{r} e_i, \qquad e_i \equiv 0 \ (S_i), \qquad e_i e_j = 0 \ (i \ne j), \qquad e_i^2 = e_i \ne 0$.


It follows readily that $S_i$ consists of all elements of the form $s e_i$, where s is
an element of S, and thus $e_i$ is the unit element of $S_i$.

* This is in agreement with the terminology used by O. Ore in his paper, Abstract ideal theory,
Bulletin of the American Mathematical Society, vol. 39 (1933), pp. 728-745.

† E. Noether, Idealtheorie in Ringbereichen, Mathematische Annalen, vol. 83 (1921), pp. 24-66;
van der Waerden, op. cit., II, p. 46.

‡ van der Waerden, op. cit., II, p. 45.

Let us now assume that relations (11) are given, and deduce from them
the decomposition (10). Let $S_i$ be the set of all elements of S of the form $s e_i$,
where s is any element of S. Then clearly $S_i$ is an ideal in S and $e_i$ is the
unit element of $S_i$. If s is any element of S, we have


        $s = s\epsilon = s e_1 + s e_2 + \cdots + s e_r$.


Thus any element of S can be expressed as the sum of elements belonging
respectively to $S_i$ (i = 1, 2, ..., r). Furthermore, this expression is unique, for
if $\sum a_i e_i = 0$, it follows by multiplication with $e_j$ that $a_j e_j = 0$ (j = 1, 2, ..., r).

We shall now show that $S_i$ is a differential ideal. Since $\Delta e_i = \Delta e_i^2 = 2 e_i \Delta e_i$,
it follows that $\Delta e_i \equiv 0$ ($S_i$). Also from (11) we find


        $\Delta\epsilon = 0 = \Delta e_1 + \Delta e_2 + \cdots + \Delta e_r$,


and thus $\Delta e_i = 0$ (i = 1, 2, ..., r). If $s_i$ is any element of $S_i$, we have
therefore $\Delta s_i = \Delta(s_i e_i) = e_i \Delta s_i \equiv 0$ ($S_i$). Hence $S_i$ is a differential ideal and S is the
direct sum of the $S_i$ (i = 1, 2, ..., r).

We have therefore shown by a familiar kind of calculation, that a
decomposition (10) has relations (11) as a consequence, and conversely. In view of
Theorem 3, it is not surprising to find that if a differential ring can be
expressed as the direct sum of ordinary ideals, these ideals are of necessity
differential ideals. We may remark here that if s is any element of S, the
correspondence $s \to s e_i$ is an $\Omega$-homeomorphism between S and $S_i$.

We conclude this section with the following theorem:


THEOREM 5. If I is a differential ideal in S, the quotient ring S/I can be
expressed as the direct sum of k differential ideals $K_i$, if and only if I can be
expressed in the form (9). By a proper choice of notation we have also that $K_i$
is $\Omega$-isomorphic to $S/I_i$ (i = 1, 2, ..., k).

The theorem that can be obtained from this one by omitting the word
“differential” and the symbol “$\Omega$” is known to be true.* We shall not give
a detailed proof of this extended theorem, as it follows readily from the known
case by means of Theorem 3, and the fact that if S/I is reducible, the
components are necessarily differential ideals.

4. Two-sided ideals in R, and quasi-commutative rings. We now return to
a further consideration of the rings $R = K[\alpha, \beta]$ and R' = K[x, y, z]
introduced earlier. If M is any set of elements of R, we shall let M' denote the set
of corresponding elements of R', and conversely.

We associate with the commutative ring R' the operator domain $\Omega$ consisting


* van der Waerden, op. cit., II, p. 47. See also E. Noether and W. Schmeidler, Moduln in
nichtkommutativen Bereichen, Mathematische Zeitschrift, vol. 8 (1920), p. 11.

of the two operators $z(\partial/\partial x)$ and $z(\partial/\partial y)$. Since R' consists of all
polynomials in x, y, z with coefficients in K, it is clear that R' is closed under
these operations. Also these operators satisfy all requirements prescribed in
the preceding section, and thus R' is a differential ring with respect to these
operators. Throughout the remainder of this paper, it will be understood
that the terms “differential ring” and “differential ideal” refer to the
particular operator domain $\Omega$.


The following theorem now gives a characterization of the two-sided ideals
in R.

THEOREM 6. A set M of elements of R is a two-sided ideal in R, if and only
if the set M' of corresponding elements of R' is a differential ideal in R'.


First let us assume that M is a two-sided ideal in R, and show that M'
is a differential ideal in R'. Let f', g' be any elements of M', h' any element of
R'. Then f, g are elements of M, h an element of R. Hence


        $f - g \equiv 0, \qquad hf \equiv 0, \qquad fh \equiv 0$ (M).

Also, by relations (6), it follows that

        $\gamma \, \frac{\partial f}{\partial\beta} \equiv 0, \qquad \gamma \, \frac{\partial f}{\partial\alpha} \equiv 0$ (M).

We therefore have at once

        $f' - g' \equiv 0, \qquad z \, \frac{\partial f'}{\partial x} \equiv 0, \qquad z \, \frac{\partial f'}{\partial y} \equiv 0$ (M'),

and we have only to show that $f'h' \equiv 0$ (M'). From relations (8) we find that


        $f'h' = \left( \sum_{s} \frac{\gamma^s}{s!} \, \frac{\partial^s f}{\partial\beta^s} \, \frac{\partial^s h}{\partial\alpha^s} \right)'$.


By repeated use of (6), $\gamma^s \, \partial^s f/\partial\beta^s \equiv 0$ (M) for each s.
Hence the expression in parentheses belongs to M and therefore $f'h' \equiv 0$ (M').
Now let M' be a given differential ideal in R', and M the set of corresponding
elements of R. From the preceding case it is clearly sufficient to show that
if f is any element of M, h any element of R, then $fh \equiv 0$ (M), $hf \equiv 0$ (M).
From equation (7), we have


        $(fh)' = \sum_{s} \frac{(-z)^s}{s!} \, \frac{\partial^s f'}{\partial y^s} \, \frac{\partial^s h'}{\partial x^s}$.

But since M' is a differential ideal,

        $z^s \, \frac{\partial^s f'}{\partial y^s} \equiv 0$ (M').

Hence $(fh)' \equiv 0$ (M'), and therefore $fh \equiv 0$ (M). It follows similarly that
also $hf \equiv 0$ (M), and M is a two-sided ideal in R. This completes the proof
of the theorem.
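
The correspondence between two-sided ideals of R and differential ideals of R' can be illustrated on a small case. The example is an assumption chosen for illustration, not one from the paper: take M' = (x, z) in R', which is closed under $z \, \partial/\partial x$ and $z \, \partial/\partial y$ because $z \, \partial x/\partial x = z$ lies in M'. The corresponding set M consists of the elements of R none of whose terms is a pure power of $\beta$. By contrast, the ideal (x) alone is not a differential ideal, and the corresponding set in R fails to be a two-sided ideal. Coefficients are integers and products in R follow rule (4); the names below are hypothetical.

```python
from math import comb, factorial

def mul(f, g):
    """Product in R via rule (4), extended bilinearly."""
    out = {}
    for (i, j, k), a in f.items():
        for (l, m, n), b in g.items():
            for t in range(min(j, l) + 1):
                c = (-1) ** t * factorial(t) * comb(j, t) * comb(l, t) * a * b
                key = (i + l - t, j + m - t, k + n + t)
                out[key] = out.get(key, 0) + c
    return {key: c for key, c in out.items() if c != 0}

def in_M(f):
    """Membership in the set corresponding to M' = (x, z):
    every term involves alpha or gamma."""
    return all(i >= 1 or k >= 1 for (i, j, k) in f)

def in_N(f):
    """Membership in the set corresponding to (x): every term involves alpha."""
    return all(i >= 1 for (i, j, k) in f)

ALPHA = {(1, 0, 0): 1}
BETA = {(0, 1, 0): 1}

f = {(1, 0, 0): 1, (0, 2, 1): 3}                   # an element of M
h = {(0, 3, 0): 1, (2, 1, 0): -1, (0, 0, 0): 4}    # an arbitrary element of R

left_product = mul(h, f)
right_product = mul(f, h)
beta_alpha = mul(BETA, ALPHA)   # equals alpha*beta - gamma
```

Both products stay in M, while `beta_alpha` contains the term $-\gamma$ and therefore leaves the set attached to (x), in line with the fact that (x) is not closed under $z \, \partial/\partial x$.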

Let M' be a differential ideal in R' with the ideal basis $f_1', f_2', \ldots, f_r'$.
Denote by N the two-sided ideal in R with the basis $f_1, f_2, \ldots, f_r$; that is, N
consists of all finite sums of terms of the form $h f_i g$, where h and g are elements
of R. We shall write $M' = (f_1', f_2', \ldots, f_r')$, $N = (f_1, f_2, \ldots, f_r)$. We shall now
show that N = M.

Let f be any element of M. Then $f' \equiv 0$ (M') and we may write
$f' = \sum_i f_i' h_i'$. By relation (8) it follows that


        $f = \sum_i \sum_s \frac{\gamma^s}{s!} \, \frac{\partial^s f_i}{\partial\beta^s} \, \frac{\partial^s h_i}{\partial\alpha^s}$.


But from (6), it is clear that

        $\gamma^s \, \frac{\partial^s f_i}{\partial\beta^s} \equiv 0$ (N);

hence $f \equiv 0$ (N). Thus all elements of M are also elements of N. Since the
converse is obviously true, it follows that N = M. This result may be stated
as follows:


THEOREM 7. If $M' = (f_1', f_2', \ldots, f_r')$ is a differential ideal in R', then the
corresponding two-sided ideal in R is $M = (f_1, f_2, \ldots, f_r)$.


We now pass to an extension of Theorem 6. Let M and M' denote
respectively a two-sided ideal in R, and the corresponding differential ideal in R'.
Let f be any element of R, and suppose $f \to \bar f$ by the homeomorphism R ~ R/M,
and $f' \to \bar f'$ by the homeomorphism R' ~ R'/M'. The correspondence $\bar f \to \bar f'$ is
then a one-to-one correspondence between elements of R/M and those of
R'/M'. For if $f \equiv g$ (M), it follows that $f' \equiv g'$ (M'), and conversely. It is
also clear that if $\bar f \to \bar f'$, $\bar g \to \bar g'$, then $\bar f + \bar g \to \bar f' + \bar g'$.

If $\bar N$ is a set of elements of R/M, we shall let $\bar N'$ denote the set of
corresponding elements of R'/M', and conversely. We may then extend Theorem 6
as follows:

THEOREM 8. A set $\bar N$ of elements of R/M is a two-sided ideal in R/M, if
and only if $\bar N'$ is a differential ideal in R'/M'.


First let $\bar N$ be a two-sided ideal in R/M, and N the set of all elements of
R which correspond to elements of $\bar N$ by the homeomorphism R ~ R/M. Then
N is a two-sided ideal in R, and by Theorem 6, N' is a differential ideal in R'.
Now $\bar N'$ is the set of all elements of R'/M' to which elements of N' correspond
by the homeomorphism R' ~ R'/M'. We shall show that $\bar N'$ is a differential
ideal in R'/M'. Let $\bar n_1'$, $\bar n_2'$ be any elements of $\bar N'$, $\bar f'$ any element of R'/M'.
Thus there exist elements $n_1'$, $n_2'$ of N' and an element f' of R' such that
$n_1' \to \bar n_1'$, $n_2' \to \bar n_2'$, $f' \to \bar f'$ by the homeomorphism R' ~ R'/M'. Then clearly
$n_1' - n_2' \to \bar n_1' - \bar n_2'$, $f' n_1' \to \bar f' \bar n_1'$, $\Delta n_1' \to \Delta \bar n_1'$, where $\Delta$ is either of our
differential operators. Now since N' is a differential ideal, $n_1' - n_2'$, $f' n_1'$, $\Delta n_1'$ are
elements of N'. Thus


        $\bar n_1' - \bar n_2' \equiv 0, \qquad \bar f' \bar n_1' \equiv 0, \qquad \Delta \bar n_1' \equiv 0$ ($\bar N'$).


That is, $\bar N'$ is a differential ideal in R'/M'.

Now let $\bar N'$ be a given differential ideal in R'/M', and N' the set of all
elements of R' which correspond to elements of $\bar N'$ by the homeomorphism
R' ~ R'/M'. It follows readily that N' is a differential ideal in R', and N is a
two-sided ideal in R. Hence $\bar N$ is a two-sided ideal in R/M, and the theorem is
established.

If T is any ring, and each element of T can be expressed uniquely as a
sum of elements which belong respectively to the two-sided ideals $T_i$ (i = 1,
2, ..., k) in T, then T is said to be the direct sum of the ideals $T_i$, and we
write

        $T = T_1 + T_2 + \cdots + T_k$.

If T can be expressed as the direct sum of two or more two-sided ideals, we
shall say that T is reducible, otherwise irreducible. These terms have been
defined in §3 for commutative differential rings. From the preceding theorem
we can therefore establish at once the following theorem.


THEOREM 9. Let M and M' denote respectively a two-sided ideal in R, and
the corresponding differential ideal in R'. Then


(12)        $R/M = R_1 + R_2 + \cdots + R_k$

if and only if

(13)        $R'/M' = R_1' + R_2' + \cdots + R_k'$.


This theorem shows that not only can the two-sided ideals in R be
determined by working in the commutative ring R', but that reducibility of
R/M corresponds also to reducibility of R'/M'.

We now assume, for the remainder of this section, that in K each ideal
has a finite ideal basis, and the same is therefore true for the ring R' = K[x, y, z].*
Hence if M' is any differential ideal in R', there exists, by Theorem 4, a
unique decomposition,


(14)        $M' = [M_1', M_2', \ldots, M_k']$,


where the $M_i'$ are direct indecomposable differential ideals such that
$(M_i', M_j') = (\epsilon)$ ($i \ne j$). Now let $M_i$ be the two-sided ideal in R corresponding
to $M_i'$. Then we have $(M_i, M_j)' = (M_i', M_j')$, and $(M_i \cap M_j)' = M_i' \cap M_j'$.

In accordance with the definition given in §3, we may say that a two-sided
ideal N in R is direct indecomposable if it cannot be expressed in the
form $N_1 \cap N_2$, where $N_1$ and $N_2$ are proper two-sided ideal divisors of N, such
that $(N_1, N_2) = (\epsilon)$. It follows at once that the $M_i$ are direct indecomposable.
The following theorem is then an immediate consequence of Theorem 4.


THEOREM 10. Corresponding to the decomposition (14) of M' there exists 
a unique decomposition of M of the form 


(15) M = [Mi, Mz,---, Mz], 


where the M ; are direct indecom posable two-sided ideals such that (M;, M;) =(e) 

If M has the decomposition (15), then M’ has the decomposition (14), 
and by Theorem 5, we have the unique decomposition (13) of R’/M’, and 
the R/ are irreducible. It follows from Theorem 8 that R/M has the unique 
decomposition (12), where the R; are irreducible. We thus have 


THEOREM 11. The quotient ring R/M can be expressed uniquely as the direct
sum of k two-sided ideals $R_i$, if and only if M has a decomposition of the
form (15). Furthermore, by a proper choice of notation, $R_i = R/M_i$.‡


It follows at once that each quotient ring R/M can be expressed uniquely 
as a direct sum of irreducible two-sided ideals. 


COROLLARY. The quotient ring R/M is irreducible if and only if M (or M’)
is direct indecomposable.


We now return to a consideration of quasi-commutative rings over K. In
§1, it was shown that any quasi-commutative ring over K is isomorphic to
a quotient ring R/M, where M is a two-sided ideal in R, and we have now
given a characterization of the two-sided ideals in R. However, if M is a
given two-sided ideal, the quotient ring R/M may clearly not be quasi-commutative
over K, but over some ring homeomorphic to K.


* van der Waerden, op. cit., II, p. 23.

† Cf. W. Krull, Zweiseitige Ideale in nichtkommutativen Bereichen, Mathematische Zeitschrift,
vol. 28 (1928), p. 499.

‡ See Noether and Schmeidler, loc. cit., p. 14.

Let L denote the ideal in K consisting of the elements of M which are
also elements of K, and denote by $\bar K$ the ring K/L. Let $\bar\alpha$, $\bar\beta$ be the elements
of R/M to which α, β respectively correspond. Then R/M is a quasi-commutative
ring $\bar K[\bar\alpha, \bar\beta]$ over $\bar K$. Now $\bar K$ will be isomorphic to K, if and only if
L = (0), and if this is true, it follows that $\bar K[\bar\alpha, \bar\beta]$ is isomorphic to K[α, β].*
Thus R/M is a quasi-commutative ring over K, if and only if M contains no
elements of K, except the zero. By the preceding section, any such two-sided
ideal M in R corresponds to a differential ideal M’ in R’, which contains no
element of K besides the zero.

5. Finite algebras homeomorphic to R. We conclude with a few remarks 
about quasi-commutative rings over a field, which are also finite algebras over 
that field. 

Let K now be a non-modular field with unit element 1, and A a finite
algebra homeomorphic to R, and therefore isomorphic to R/M, where M is a
two-sided ideal (invariant sub-algebra) in R. If M contains any non-zero element
of K, then clearly M = R, and the homeomorphism is a trivial one.
Hence we may assume that M contains no non-zero element of K, and by
the homeomorphism, K corresponds to a field $\bar K$, simply isomorphic to K.
We shall consider these fields to be identical, as we may without essential
loss of generality.

By the homeomorphism R ~ A, suppose α → p, β → q, γ → r. Then
$\sum a_{ijk}\alpha^i\beta^j\gamma^k \to \sum a_{ijk}p^iq^jr^k$. Since now A is a finite algebra over K, each
element of A satisfies a unique minimum equation with coefficients in K,
and leading coefficient unity.† Let f(λ) = 0, g(λ) = 0, h(λ) = 0 be the minimum
equations of p, q, and r respectively. We may now prove the following
theorem:


THEOREM 12. If $f^{(i)}(\lambda)$ denotes the ith derivative of f(λ) with respect to λ,
then we have

$r^i f^{(i)}(p) = 0$, $r^i g^{(i)}(q) = 0$ $(i = 0, 1, \ldots)$.


Now M consists of those elements of R which correspond to the zero element
of A, hence M contains f(α), g(β), h(γ). Thus M’ contains f(x), g(y),
h(z), and since it is a differential ideal, it contains $z^i f^{(i)}(x)$, $z^i g^{(i)}(y)$. That is,
M contains $\gamma^i f^{(i)}(\alpha)$, $\gamma^i g^{(i)}(\beta)$, and we have the desired result.


* This follows readily since the constants on the right of the multiplication formula (2) are
independent of the ring K.
† See L. E. Dickson, Algebras and their Arithmetics, Chicago, 1923, p. 111.

It follows at once from this theorem that r is nilpotent, and hence
$h(\lambda) = \lambda^k$, where the index k does not exceed the degree of f(λ) or of g(λ).
We may, however, get more information about k in the following way. Let

$f(\lambda) = \prod_i [f_i(\lambda)]^{m_i}$

be the decomposition of f(λ) into powers of distinct polynomials which are
irreducible in K, and denote by m the maximum of the $m_i$. Then f(λ) and
$f^{(m)}(\lambda)$ have no factor in common, and their resultant D is not zero. We thus
have a relation*

$a(\lambda)f(\lambda) + b(\lambda)f^{(m)}(\lambda) = D \neq 0$,

where a(λ) and b(λ) are polynomials with coefficients in K. It follows that
$b(p)f^{(m)}(p) = D$, and thus by the preceding theorem, $r^m D = 0$. That is, $r^m = 0$,
and we have established

THEOREM 13. The index k of r does not exceed the multiplicity of the factor
of f(λ) [or of g(λ)] of greatest multiplicity.


By Theorem 9, the question of reducibility of A is equivalent to that of
reducibility of R’/M’, and this in turn depends upon whether M’ is direct
indecomposable or not. We may now prove the following theorem.


THEOREM 14. If f(λ) [or g(λ)] is expressible as the product of two relatively
prime factors with coefficients in K, the algebra A is reducible.


Suppose f(λ) = φ(λ)ψ(λ), where φ(λ) and ψ(λ) are relatively prime, and
have coefficients in K. There then exists a relation

(16) $a(\lambda)\varphi(\lambda) + b(\lambda)\psi(\lambda) = 1$.

Let $M_1' = (M', \varphi(x))$, $M_2' = (M', \psi(x))$. Since f(x) is the polynomial in x of
minimum degree belonging to M’, it follows that $M_1'$ and $M_2'$ are proper divisors
of M’. We have also from relation (16) that $(M_1', M_2') = (1)$. It is easy
to show that $M' = M_1' \cap M_2'$. Let c(x) be any element belonging to $M_1' \cap M_2'$.
Then we have

$c(x) \equiv d(x)\varphi(x) \equiv e(x)\psi(x)$ (M’),

and thus $d(x)\varphi(x) - e(x)\psi(x) \equiv 0$ (M’). If we multiply this last relation by
a(x), we find, by use of relation (16), that

$d(x) \equiv [d(x)b(x) + a(x)e(x)]\psi(x)$ (M’).

Thus $d(x)\varphi(x) \equiv 0$ (M’), and therefore $c(x) \equiv 0$ (M’). By Theorem 5 it now
follows that R’/M’ is reducible, and Theorem 9 then shows that A = R/M
is reducible.


* See, e.g., van der Waerden, op. cit., II, p. 4.
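
The role played by relation (16) can be seen in the simplest commutative situation, the quotient of a polynomial ring in one variable by f. The Python sketch below is only an illustration under assumptions that are not taken from the paper: the field is the rationals, f(λ) = (λ − 1)(λ − 2), and the helper names `poly_mul` and `poly_mod` are hypothetical. It checks that a(λ)φ(λ) and b(λ)ψ(λ), reduced modulo f, are complementary idempotents, which is what produces a direct decomposition.

```python
from fractions import Fraction

def poly_mul(p, q):
    """Multiply polynomials given as coefficient lists, lowest degree first."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_mod(p, f):
    """Remainder of p on division by the monic polynomial f."""
    p = [Fraction(c) for c in p]
    d = len(f) - 1                      # degree of f
    while len(p) > d:
        lead = p.pop()                  # coefficient of the current top power
        shift = len(p) - d              # subtract lead * x**shift * f
        for k in range(d):
            p[shift + k] -= lead * f[k]
    return p + [Fraction(0)] * (d - len(p))

# Assumed example: phi = x - 1, psi = x - 2, f = phi * psi.
phi, psi = [-1, 1], [-2, 1]
f = poly_mul(phi, psi)                  # x**2 - 3x + 2
a, b = [1], [-1]                        # a*phi + b*psi = (x - 1) - (x - 2) = 1

e1 = poly_mod(poly_mul(b, psi), f)      # class of b*psi modulo f
e2 = poly_mod(poly_mul(a, phi), f)      # class of a*phi modulo f
```

Here `e1` and `e2` sum to 1 and multiply to 0 modulo f, so each is idempotent; this mirrors, in a toy case, how the relatively prime factors of f split the quotient.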

The converse of this theorem is not in general true even for the commutative
case, as can be shown by the following example. Let K be the field
of real numbers, and set f(λ) = λ²+1, g(λ) = λ²+1, M’ = (x²+1, y²+1, z),
$M_1' = (M', 1 - xy)$, $M_2' = (M', 1 + xy)$. It is not difficult to show that $M_1'$ and
$M_2'$ are proper ideal divisors of M’ and that $M' = M_1' \cap M_2'$, $(M_1', M_2') = (1)$.
Hence R/M is reducible.



CONVERSE THEOREMS OF SUMMABILITY
FOR DIRICHLET’S SERIES*


BY
OTTO SZÁSZ


1. Let the Dirichlet series

(1) $F(t) = \sum_{\nu=1}^{\infty} c_\nu e^{-\lambda_\nu t}$

be convergent for t > 0. Let, in addition, the limit

(2) $\lim_{t \to +0} F(t) = s$

exist. It is well known that this is certainly the case whenever the series
$\sum c_\nu$ converges. But the converse, in general, is not true, as is shown for
instance by the example $\lambda_\nu = \nu$, $c_\nu = (-1)^\nu$,

$F(t) = \sum_{\nu=1}^{\infty} (-1)^\nu e^{-\nu t} = -e^{-t}(1 + e^{-t})^{-1} \to -\tfrac{1}{2}$ as $t \to 0$.
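
The counterexample can be checked numerically. The following Python sketch is an illustration only; the function names and the truncation length are assumptions made here. It compares a truncated series with the closed form for small t and shows that the partial sums of the coefficients keep oscillating between −1 and 0.

```python
import math

def F(t, terms=20000):
    """Truncated series sum_{v>=1} (-1)**v * exp(-v*t); adequate for t >= 0.01."""
    return sum((-1) ** v * math.exp(-v * t) for v in range(1, terms + 1))

def closed_form(t):
    """Closed form -e^{-t} / (1 + e^{-t}) of the same series."""
    return -math.exp(-t) / (1.0 + math.exp(-t))

def partial_sum(n):
    """Partial sum c_1 + ... + c_n of the coefficients c_v = (-1)**v."""
    return sum((-1) ** v for v in range(1, n + 1))

if __name__ == "__main__":
    for t in (1.0, 0.1, 0.01):
        print(t, F(t), closed_form(t))
    print([partial_sum(n) for n in range(1, 9)])
```

As t decreases the printed values approach −1/2, while the partial sums never settle, so the limit (2) exists although the series of coefficients diverges.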
Thus we are led to the problem of finding additional conditions which
together with the assumption (2) would assure the convergence of $\sum c_\nu$.
Such conditions are the following:


(3) $c_n = O\big((\lambda_n - \lambda_{n-1})/\lambda_n\big)$ as $n \to \infty$.†
(4a) lim sup maximum | Sn — Sm | = ¥(6) > 0 as 6 0,7 a 1, 
m2 SA (148) An 


and 


(4b) = = O(1) 


v=1 
[Landau, 6]. 
* Presented to the Society, April 20, 1935; received by the editors February 17, 1935.
† See Littlewood [7] in case $\lambda_{n+1}/\lambda_n \to 1$, Hardy and Littlewood [4] in case
$\liminf_{n\to\infty} \lambda_{n+1}/\lambda_n > 1$,
Ananda-Rau [1] for the rest. The numbers in brackets refer to the list at the end of this paper.


(5) $\limsup_{n\to\infty} \sum_{\lambda_n < \lambda_k \le (1+\delta)\lambda_n} |c_k| = \psi(\delta) \to 0$ as $\delta \to 0$, $\lambda_{n+1}/\lambda_n \to 1$
[Neder, 8].
(6) $\sum_{\nu=2}^{n} |c_\nu|^p \lambda_\nu^p (\lambda_\nu - \lambda_{\nu-1})^{1-p} = O(\lambda_n)$, $p > 1$, $\lambda_{n+1}/\lambda_n \to 1$


[Szász, 9].


(7) lim inf minimum > « = (6) ~ 7 2 0 as 6— 0 


[Szász, 10].

2. The proof of (5) can be reduced to the special case $\lambda_\nu = \nu$ without using
the condition $\lambda_{n+1}/\lambda_n \to 1$. Thus the condition $\lambda_{n+1}/\lambda_n \to 1$ can be omitted in (5).
Indeed, let


>> cr = b,, >| cx| = £B, 


w—1< 
Then for any 6>0 

b,| S B, < + 6, for > n(6), 
so that 


lim 6, = lim £, = 0, 


(8) vo 


and the series 


converges for ¢>0. Moreover 


4 


<v 
F(t) — | f(t) | — emt) | < cr 


pel 


Thus by (8) 


Fi) Be’ 5 0 as t 0, 


v= 


and, by (2), 
(2*) s as t0. 


Since 


(5) 
= > 
v=1 
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S(1+5)2 


1 
(1 + S (1 + 26)(x — 1) 2, 


S(1+6)z 


lim sup >> | 4,| S (26) > 0 as 6-0. 


Consequently $\sum b_\nu$ converges, which in view of (8) implies the convergence
of $\sum c_\nu$.

3. A similar reduction and generalization is possible for the case (4). Since
the expression $e^{-\lambda_k t} - e^{-\nu t}$ is ≥ 0 and monotonely decreasing in k, we have,
by (4a),


<’ 


sz 
xSv 


(1 — '0(1) as vp, 
Hence again 
F(t) fi) 0, s 0. 
Now on putting >»”_,b,=B, we have 


n 


sn <m 
B= >> > te; B, — Bn = Dix, n<m, 


and, by (4a), 


lim sup maximum | B,- Bn | = 0 as 6-0. 
m0 


We conclude that $\sum b_\nu$ is convergent and from
sr 
lim max| >> = 0 
| 


the convergence of $\sum c_\nu$ follows immediately.
The results of Neder and Landau without the assumptions $\lambda_{n+1}/\lambda_n \to 1$ and
(4b) can also be derived from (7), for


minimum c, => — maximum 


Finally, in (6) also the restriction $\lambda_{n+1}/\lambda_n \to 1$ can be removed. During the
writing of this paper there appeared an interesting paper by Iyer,* where a
proof of this generalization is given. In the present paper I give a proof which
is somewhat simpler, and a further generalization.


‡ For a similar argument see Ananda-Rau [3].


4. In what follows it is simpler to use Laplace integrals. We intend to
prove two auxiliary theorems which are of interest in themselves.


THEOREM 1. Assume that

(1’) $F(t) = t\int_0^\infty A(u)e^{-ut}\,du$ converges for $t > 0$,

(2’) $F(t) \to s$ as $t \to 0$,

(9) $v(x) = xA(x) - \int_0^x A(u)\,du \ge -Kx$, $x > 0$,

where K is a positive constant. Then

(10) $\frac{1}{x}A_1(x) = \frac{1}{x}\int_0^x A(u)\,du \to s$ as $x \to \infty$.

For the proof we need three lemmas. 


LEMMA 1. Assumptions (1’) and (2’) imply that the integral

$F_1(t) = t\int_0^\infty \frac{1}{u}A_1(u)e^{-ut}\,du$

converges for t > 0 and

(11) $F_1(t) \to s$ as $t \to 0$.


On integrating by parts in (1’) and using (2’) we have

$F(t) = t^2\int_0^\infty A_1(u)e^{-ut}\,du \to s$, $t \to 0$,

where the integral converges absolutely for t > 0. Hence


f (x)dx -{ f e~"*dx 
t 0 t 
1 
= f = o(-), t— 0. 
o t 


dx t 
F() — Fi) = f Pw f +t f 


Now 


* [5]. In the proof on p. 112 the author refers to his Theorem 4 in the statement of which the “O” 
is to be replaced by “o,” as is seen from the proof. 
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whence 
| — Fit)|S max, | — F(x)| + +0) 0, 
isxst 


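and so, by (2’), $F_1(t) \to s$ as $t \to 0$.

LEMMA 2. If $a(x) \to 0$ as $x \to \infty$, then

(12) $t\int_0^\infty a(u)e^{-ut}\,du \to 0$ as $t \to 0$.


This lemma is well known.


LEMMA 3. If $xb(x) \to 0$ as $x \to \infty$, and

(13) $\int_0^\infty b(u)e^{-ut}\,du \to B$ as $t \to 0$,

then the integral $\int_0^\infty b(u)\,du$ exists as an improper integral and

(14) $\int_0^\infty b(u)\,du = B$.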


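We have

$\int_0^x b(u)\,du - \int_0^\infty b(u)e^{-ut}\,du = \int_0^x b(u)(1 - e^{-ut})\,du - \int_x^\infty b(u)e^{-ut}\,du = H_1 - H_2$.

For t = 1/x, x → ∞, it follows that

$|H_1| \le \frac{1}{x}\int_0^x u\,|b(u)|\,du = o(1)$,

$|H_2| = o\Big(\int_x^\infty \frac{1}{u}e^{-u/x}\,du\Big) = o\Big(\frac{1}{x}\int_x^\infty e^{-u/x}\,du\Big) = o(1)$.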


We now pass on to the proof of Theorem 1. From (2’) and (11) it follows that 


if —e“'du— 0 as t—> 0, 
0 u 


whence 


/ v(m) 
if ( + K K as 0. 
0 


By a well known theorem, this and (9) yield 


*/v(u) 
+K)du~ Kx as 
0 


ut 
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or 
1 1 = 
(15) — a(x) = — f du—-Oas 
x x 0 u 


Furthermore, 
d A,(x) 


v(x) = xA(x) — Ai(x) = x? 
dx 


On assuming, as we may without loss of generality, A(x) =0(x) as x0, we 
have 


(10’) = du, 


x u? 


and, on integrating by parts, 


v(u) ale) + du; n(x) = 


u? u3 


Thus (11) becomes 
1 04(7) 
(11’) F,(t) = if (= + 2f ar) e“tdu—s, t- 0. 
0 u? 0 
But on integrating by parts we have by (15) 
(16) v(x) = -f @(u)du = o(x?) as 
0 
whence, by Lemma 2, 
if + 0 as 0. 
0 
Now from (11’) it follows that 


f ( f r-tn(r)dr) e“du—s ast—0, 
0 0 


or, on integrating by parts again, 


2f (u)e“'du > s as t— 0. 
0 


Lemma 3 shows that 


2f u-*y,(u)du = s, 
0 


A. 
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whence, by another integration by parts and by (16), 


f u~*y(u)du = s. 


Combined with (10’) this proves Theorem 1. 
5. In order to apply this result to the series (1) put
$\lambda_0 = 0$, $s_0 = 0$,
$A(x) = s_n$ for $\lambda_n \le x < \lambda_{n+1}$ $(n = 0, 1, 2, \ldots)$.


A summation by parts yields s, =0(e*), ¢>0, and 


F(t) = — ern!) = t f A(u)e“tdu, t > 0. 
0 


y=] 


Furthermore, for \, <An41, 


> (A, Ay—1) + (x An) Sn 
0 


n 


(x = (x — 


Thus it appears that (1/x)f (A (u)du is the typical mean (R, d) of the first 
order of the series c,. Again 


v(x) = AnSn — (A, — = An S 


v=1 


v(x) = re. 


s z 
Now assume 


(17) = = Ky, 
v=1 
Then 
v(x) Kya 


—2- = — K, S Ant, 
x x x 


and Theorem 1 yields 
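THEOREM 1’. Conditions (1), (2) and (17) imply

(18) $\sum_{\lambda_\nu < x} (x - \lambda_\nu)c_\nu = \int_0^x A(u)\,du \sim sx$, $x \to \infty$.


6. Next we prove

THEOREM 2. Let

$\sum_{\lambda_\nu < x} (x - \lambda_\nu)c_\nu \sim sx$, $x \to \infty$,

and let

(6’) $\sum_{\nu=1}^{n} |c_\nu|^p \lambda_\nu^p (\lambda_\nu - \lambda_{\nu-1})^{1-p} = O(\lambda_n)$ as $n \to \infty$, $p > 1$.

Then
$s_n = A(\lambda_n) \to s$ as $n \to \infty$.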
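
The first-order typical mean that enters (18) and Theorem 2 can be computed in two equivalent ways: as the weighted sum over λ_ν < x, or as the average of the step function A(u). The Python sketch below is illustrative only; the function names and the sample data λ_ν = ν, c_ν = (−1)^ν are assumptions chosen here, not part of the paper.

```python
def riesz_mean(lams, cs, x):
    """Sum over lam_v < x of (1 - lam_v / x) * c_v."""
    return sum((1.0 - lam / x) * c for lam, c in zip(lams, cs) if lam < x)

def step_integral_mean(lams, cs, x):
    """(1/x) * integral_0^x A(u) du, with A(u) = s_n on [lam_n, lam_{n+1})."""
    total, s, prev = 0.0, 0.0, 0.0      # s_0 = 0 and lam_0 = 0
    for lam, c in zip(lams, cs):
        if lam >= x:
            break
        total += s * (lam - prev)       # A(u) = s on [prev, lam)
        s += c
        prev = lam
    total += s * (x - prev)             # last, partial interval up to x
    return total / x

# Assumed sample data: lam_v = v, c_v = (-1)**v.
lams = list(range(1, 2001))
cs = [(-1) ** v for v in lams]

if __name__ == "__main__":
    for x in (10.5, 100.5, 1000.5):
        print(x, riesz_mean(lams, cs, x), step_integral_mean(lams, cs, x))
```

For this sample the mean tends to −1/2 although the partial sums oscillate, which shows that a conclusion of the kind stated in Theorem 2 needs an extra hypothesis such as (6’); the sample series does not satisfy it.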
We start with the identity 
(1+6) Ay An 
= J A(u)du — f A(u)du — [A(u) — A(An) 


1 1+6 1 1 
A (u)du-——— — — A(u)du-— 
(1+ 6 6 


1+5)An 
(19) 1 + [A(w) = A(An) |du, 6>0. 


Yd, 


Here 


_k 


v= l 
( 0, An < Uu < An+1; 


and 


n+k n+k nt+k (A, 


y=n+l1 v=n+l 


By Hélder’s inequality and by (6’), 


n+k We. 
D | | = — An) 


n+1 
Since we have 


n+k 


n+1 


or 
| 
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and 
A(u) — A(An) = O16"), An< uS (1+ 
For a fixed 6 it follows from (18) and (19) that 


lim sup s, S s + O(61/?’), 


no 


and, on allowing 6—0, 


lim sup Sp 


A similar argument shows 


lim inf s, = 


n— 


which completes the proof of Theorem 2.
Theorems 1’ and 2 immediately yield


THEOREM 3. Conditions (2) and (6’) imply that $\sum_{\nu=1}^{\infty} c_\nu$ converges to s.


It is plain that (6’) implies the convergence of (1) for t > 0. It remains
only to observe that (6’) implies (17). Indeed


v=1 1 


1/p’ 


n 1/p n 
( > | | "AP (A, ( > (A, 


1/p+1/p’ 


= ) = O(n). 


7. Another generalization of (6) is 
(6”) (| | — (Ay — = O(n), P> 1, 
1 


while 


(21) lim inf c, = 0.* 
It can be derived from (7), but we can prove it directly, by an argument used 
above. 

First we have 


* If Ang1/An—1, then (21) is a consequence of (6’’) as will be shown later (cf. (21’)). For the case 
An =n cf. Szdsz [11]. 
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n n n nA 1/p 
= S r( | —o)s | > (| Cr) "| 
1 1 1 


= O(A,). 
Hence by Theorem 1’, (1), (2) and (6’’) imply 


(x = f A(u)du = A\(x)~ sx, x7 
0 


Next we have 


n+k n+k 


n+1 n+1 


n+1 
hence by (20) 


| (| | = Cy) Anti (Ante An) 


— [A(u) — A(An) S ) = 


and so by (19) 
lim sup S s + O(6'/2’). 


20 


On allowing here 6-0 we get 


lim sup S S$. 


no 


Furthermore, since 


An An Xn 
(22) asa = f A(u)du — f A(u)du + [sn — A(u) |du 
1+6 0 0 


(14+8)—? 


and since s,=A(A,), 


k 
Sn — A(u) = > for < Ks, S O.- 


v=0 


Now if and =A, (1+ then 


k—1 
Cn—v =. (| Cn—v| — 
v=0 0 


n 


n—k+l1 


— 
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Furthermore, by (6’’), 


1/p’ 
n—k 


Hence under either one of the assumptions An+:/A,.—1, or lim inf,..¢, 20, it 
follows from (22) that 


lim inf s, = s — O(6'/?’), 


and on allowing 6—0, 


lim inf s, => s. 


This yields 
THEOREM 4. If (1), (2) and (6’’) hold and if at least one of the additional
conditions

(a) $\lambda_{n+1}/\lambda_n \to 1$, (b) $\liminf_{n\to\infty} c_n \ge 0$

is satisfied, then $\sum_{\nu=1}^{\infty} c_\nu$ converges to s.
Notice that conditions (1), (2) and (6’’) imply

$\sum_{\lambda_\nu < x} (x - \lambda_\nu)c_\nu \sim s \cdot x$, and $\limsup s_n \le s$,

but not the convergence in general. Even the more restrictive condition

$\lambda_n c_n > -K(\lambda_n - \lambda_{n-1})$ $(n = 1, 2, 3, \ldots)$

does not imply the convergence, as is shown by an example of Ananda-Rau
[2].

8. We now shall state a theorem which includes as special cases not only 
the results of Landau and Neder but also condition (3) and even Theorem 3. 
On putting 

¥,.(6) = maximum | Satk — tulle ¥n(5) = O if > + 4), 
(148) 
let us assume 


lim sup¥,(6) = ¥(6) 0 as 6-0. 


n— 
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This can be written in the form 

lim sup maximum | A(x) A(An) | = ¥(6) 0, 6-0, 

An StS 
(23) 
or 
(23’) | A(x) — A(A,)| < € for An S x S (1 + 6 = 
This condition is satisfied automatically if we have a series with gaps, that 
is, if for a constant @>1 
Anti > On (wn = 1, 2,3,---). 


Assume the conditions of Theorem 1’, so that 


A<z 
This and (23’) hold if we assume (2) and (6’). 
Now using (19) and (23’) we get 


lim sups, S s+e, liminf s, = s — e. 


n— 2 


Since is arbitrary, it follows that 


lim s, = lim A(x) = s. 


Thus we obtain 
THEOREM 5. Conditions (1), (2), (17) and (23) imply 
v=1 
9. Hardy and Littlewood have proved that from 


( —Jlaltt <a, 


v=] 


and from (2) follows the convergence of >°c,. The following generalization 
is an easy consequence of Theorem 4. 


THEOREM 6. Conditions (1), (2) and 


x 


p 
>(——)ile — <0, 
v—1 


im ply =S. 
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For now we have 


Ay — Ayer 
- = ) = 0a), 


hence condition (b) is satisfied. Moreover on setting 


Un = > — Cy)ett 


1 
n—1 


1 


(6’’) holds a posteriori and Theorem 6 is proved. 
Finally we observe that condition 


(6a) > aP (A, — = 
1 


where a, stands for | c,| or for | c,|—c,, is equivalent to the following: there 
exists a constant g>1 such that 


For from (6a) it follows that 


= O(x'-?). 
Conversely on putting x,=\,g-’(v =0, 1, 2, - - - ) we have 


> (A, — = OP AP (AR — 


and by (6b) 


aPAP (A, — = o( = > r’) = O(A,).* 


* After this paper was completed and sent to the editors, the author learned of an interesting
paper by G. Ricci, Sui teoremi Tauberiani, Annali di Matematica, (4), vol. 13 (1935), pp. 287-308,
where bounds for oscillation of A(x) are given, under the assumptions (2’) and A(y)−A(x) > −K
for



NOTES ON LINEAR TRANSFORMATIONS, I*


BY
EINAR HILLE

Under the above title the author intends to publish some investigations
on the properties of linear transformations in abstract spaces. In the present
note the space is a suitable subset of the set of all measurable functions defined
for −∞ < x < ∞, and the transformations are of the form

(1) $K_\alpha[f] = \alpha\int_{-\infty}^{\infty} K(\alpha t) f(x + t)\,dt$, $\alpha > 0$.

The results, which are somewhat loosely knit together, cluster around
four problems. (i) The originators of zero, i.e., the solutions of the equation

(2) $K_\alpha[f] = 0$.

(ii) The invariant elements, i.e., the solutions of the equation

(3) $K_\alpha[f] = f$.

(iii) The functional equations satisfied by $K_\alpha[f]$ for special choices of the
kernel. (iv) The metric properties of the transformation $K_\alpha[f]$, including
properties of contraction, and degree of approximation of f by $K_\alpha[f]$ for large values
of α. The material is grouped as follows. §1 gives a survey of problems (i), (ii)
and (iv) for a general kernel K(u) ∈ $L_1$(−∞, ∞), K(u) ≥ 0. It lies in the nature
of things that the results for this case are rather incomplete. They probably
do not offer much of any novelty to the workers in the field, but serve as
background for the discussion in §§3-4. The existence of functional equations
obtained by superposition is established in §2, and the equations are given
for four particular kernels which may be associated with the names of Dirichlet,
Picard, Poisson, and Weierstrass. A closer study of the last two kernels,
which satisfy the same functional equation, is given in §3, whereas the kernel
of Picard is treated in §4. It turns out that the study of problems (i), (ii) and
(iv) for these special kernels is much simplified by the corresponding functional
equations. Some results on the Dirichlet kernel occur in §5, but lack
the same degree of completeness, sharpness and simplicity.


* Presented to the Society, April 20, 1935; received by the editors March 14, 1935.
† The author is indebted to Professor J. D. Tamarkin for helpful criticism.


1. NON-NEGATIVE KERNELS IN $L_1(-\infty, \infty)$


1.1. We shall be concerned with kernels K(u) satisfying the following
conditions:

($K_1$) K(u) is defined as a measurable non-negative function in (−∞, ∞).
($K_2$) $\int_{-\infty}^{\infty} K(u)\,du$ exists and equals unity.

Let S = S(K) be the set of all functions f(x) satisfying the two conditions

($S_1$) f(x) is defined as a measurable function in (−∞, ∞), and

(1.11) $K_\alpha[f] = \alpha\int_{-\infty}^{\infty} K(\alpha t) f(x + t)\,dt$

exists as an ordinary Lebesgue integral for almost all x and all α > 0.
($S_2$) $K_\alpha[f] \in S(K)$ for all α > 0 whenever f ∈ S(K).


It is obvious from these definitions that S(K) is a linear vector space
closed under the transformations $K_\alpha$. We note that if f(x) ∈ S(K) then all
translations of f(x), i.e., the functions f(x+h), also belong to S(K), and that the
two operations $K_\alpha$ and translation by h commute.

1.2. Problem (i) calls for the solution of the equation

(1.21) $K_\alpha[f] = 0$.

A solution is clearly f ~ 0. But is this the only solution? Not always, as we
shall see.

Let us denote the Fourier transform of g(x) by T[x; g]. Suppose that
K(u) ∈ $L_2$(−∞, ∞). It then has a Fourier transform in the same space. Suppose
that f(x) is a solution of (1.21) in $L_2$. Then by a well known formula

(1.22) $T[x; f]\,T[-x/\alpha; K] = 0$.

Here we have two possibilities. (1) T[−x/α; K] vanishes only in a null set.
In this case (1.22) implies that T[x; f] ~ 0, and consequently also f ~ 0, so
that f ~ 0 is the only solution of (1.21) in $L_2$. (2) T[−x/α; K] vanishes on
a set S of positive measure. We can assume S to be bounded. Let g(x) be a
measurable function which is bounded in S and vanishes outside S, and put
f(x) = T[−x; g]. This function f(x) is in $L_2$ and is a solution of (1.21) which
is not equivalent to zero. That this case can actually arise is shown by the
kernel $K(u) = \pi^{-1}u^{-2}(1 - \cos u)$ whose Fourier transform vanishes for |x| > 1.
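
The vanishing of the transform of this kernel outside |x| ≤ 1 can be observed numerically. The Python sketch below is an illustration under stated assumptions: it evaluates only the cosine integral of K (the transform up to the normalizing constant of T, since K is even), truncates the range at a hypothetical cutoff U, and uses a simple midpoint rule; the function names are made up for this example.

```python
import math

def K(u):
    """Kernel (1 - cos u) / (pi * u**2), with its limiting value 1/(2*pi) at u = 0."""
    if abs(u) < 1e-8:
        return 1.0 / (2.0 * math.pi)
    return (1.0 - math.cos(u)) / (math.pi * u * u)

def cosine_transform(x, U=400.0, n=200000):
    """Midpoint-rule approximation of the integral of K(u) cos(x u) over (-U, U)."""
    h = U / n
    total = 0.0
    for k in range(n):
        u = (k + 0.5) * h
        total += K(u) * math.cos(x * u)
    return 2.0 * h * total              # K(u) cos(x u) is even in u

if __name__ == "__main__":
    for x in (0.0, 0.5, 1.5, 2.0):
        print(x, cosine_transform(x))
```

Up to the truncation error, which is of order 1/U, the printed values follow the triangle 1 − |x| for |x| ≤ 1 and are zero beyond it.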

It is obvious that this method is capable of some extension, but it suffers
from the usual limitations due to the severe restrictions which must be
imposed upon the function in order that it shall have a Fourier transform. The
special kernels considered below in §§3-4 have Fourier transforms nowhere
equal to zero, and the particular properties of the kernels will enable us to
prove that f ~ 0 is the only solution of problem (i) in the corresponding space
S(K).

1.3. Problem (ii) calls for the fixed points of S(K), i.e., the solutions of
the equation*

(1.31) $K_\alpha[f] = f$.

Condition ($K_2$) shows that f = 1 is a solution. In many important cases K(u)
is an even function of u. If this is so, and x ∈ S(K), then f(x) = x is a solution
of (1.31) for every α > 0. Consequently every linear function is an invariant.
This case is realized for instance for the kernels of Picard and Weierstrass,
treated below, but not for that of Poisson, because f = x does not belong to
the corresponding space S(K).

The method of Fourier transforms leads to the equation

(1.32) $(2\pi)^{1/2}\,T[x; f]\,T[-x/\alpha; K] = T[x; f]$,

if we assume for the sake of simplicity that K(u) and f(x) are in $L_2$. We have
again two cases. (1) If $T[-x/\alpha; K] = (2\pi)^{-1/2}$ only on a null set, T[x; f]
must vanish almost everywhere, i.e., f ~ 0 is the only solution of (1.31) in $L_2$.
(2) If, on the other hand, $T[-x/\alpha; K] = (2\pi)^{-1/2}$ on a set of positive measure,
a construction similar to that of §1.2 will lead to an invariant manifold in $L_2$.

1.4. Let us now consider a metric space M(K) which is a sub-set of S(K).
We shall suppose that M(K) has the following properties.

($M_1$) It is a normed linear vector space in the sense of Banach, complete with
respect to its metric.

($M_2$) f(x) ∈ M(K) implies $K_\alpha[f]$ ∈ M(K) for every α > 0.

($M_3$) $\|K_\alpha[f]\| \le \|f\|$.

We shall first consider the possibilities of finding such spaces M(K) in
S(K). It is a simple matter to see that every Lebesgue space $L_p$(−∞, ∞),
1 ≤ p ≤ ∞, is a sub-space of every S(K), and the same is true of the space
C[−∞, ∞] of the functions which are continuous for −∞ < x < ∞. That the
customary metrics of these spaces satisfy condition ($M_1$) is well known, and
in order to see that ($M_2$) and ($M_3$) are satisfied it is enough to recall the
following inequalities:


* There are some passing remarks on this problem by N. Wiener and E. Hopf in the introduction
to their paper Ueber eine Klasse singulärer Integralgleichungen, Sitzungsberichte der Preussischen
Akademie der Wissenschaften, Mathematisch-Physikalische Klasse, 1931, pp. 696-706. They assume
that the kernel K(u) vanishes exponentially for large values of |u|. In this case the method of
bilateral Laplace transforms applies and shows that the solutions are essentially exponential
functions. The discussion of the invariant elements of the Weierstrass kernel in §3.4 could have been
made somewhat shorter with the aid of this method. [Added in proof, November 2, 1935.]

(1.41) $\int_{-\infty}^{\infty} |K_\alpha[f]|\,dx \le \alpha\int_{-\infty}^{\infty} K(\alpha t)\,dt \int_{-\infty}^{\infty} |f(x + t)|\,dx$,

(1.42) $\int_{-\infty}^{\infty} |K_\alpha[f]|^p\,dx \le \alpha\int_{-\infty}^{\infty} K(\alpha t)\,dt \int_{-\infty}^{\infty} |f(x + t)|^p\,dx$,

(1.43) e.l.u.b. $|K_\alpha[f]| \le$ e.l.u.b. $|f|$.


The first inequality refers to the case in which f ∈ $L_1$, and is immediate. The
second inequality presupposes f ∈ $L_p$, 1 < p < ∞; it follows from Jensen’s
inequality for convex functions. In (1.43) f ∈ $L_\infty$, but we have merely to replace
the essential least upper bound by the maximum in order to get the corresponding
inequality for f ∈ C.

There is consequently no lack of sub-spaces of S(K) which satisfy our
conditions. It is perhaps also possible to find a metric satisfying these
conditions which applies to the whole of S(K). Various metrics valid for the
space of measurable functions come to mind in this connection, but these
metrics normally fail to satisfy the condition ‖af‖ = |a| ‖f‖ which is a part of
($M_1$). This condition is used extensively below, especially in §3.6. But this is
actually the only part of our conditions which it seems difficult to impose on
S(K); in particular, ($M_2$) and ($M_3$) do not cause any trouble.

Condition ($M_3$) is consequently a natural assumption to make in the
study of these kernels. Its geometric significance is that the transformation
$K_\alpha[f]$ defines a contraction of the space M(K) for every fixed α > 0. In special
cases this contraction will be continuous and monotone with respect to α;
this is the case with the particular kernels discussed below.

1.5. It is an easy matter to show that ($K_1$) and ($K_2$) imply that

(1.51) $\lim_{\alpha\to\infty} K_\alpha[f] = f(x)$

at every point of continuity of f(x). Under certain circumstances we can also
show convergence of $K_\alpha[f]$ to f in the sense of the metric in M(K).
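
The pointwise limit (1.51) can be watched numerically for one concrete kernel. The Python sketch below is an illustration under assumptions made here: it uses the Gaussian kernel K(u) = π^{-1/2} e^{-u²} (the Weierstrass kernel of §2.2), the test function f = cos, a midpoint rule on a truncated range, and hypothetical function names. For this pair the transform has the closed form e^{-1/(4α²)} cos x, which the code uses as a check.

```python
import math

def kernel(u):
    """Assumed kernel: K(u) = exp(-u**2) / sqrt(pi), non-negative with unit integral."""
    return math.exp(-u * u) / math.sqrt(math.pi)

def K_alpha(f, x, alpha, n=4000, cutoff=8.0):
    """Midpoint rule for alpha * integral K(alpha t) f(x + t) dt.

    The substitution s = alpha * t turns the integral into
    integral K(s) f(x + s / alpha) ds, truncated to |s| <= cutoff.
    """
    h = 2.0 * cutoff / n
    total = 0.0
    for k in range(n):
        s = -cutoff + (k + 0.5) * h
        total += kernel(s) * f(x + s / alpha)
    return h * total

if __name__ == "__main__":
    x = 0.3
    for alpha in (1.0, 4.0, 16.0):
        exact = math.exp(-1.0 / (4.0 * alpha * alpha)) * math.cos(x)
        print(alpha, K_alpha(math.cos, x, alpha), exact, math.cos(x))
```

The error |K_α[f](x) − f(x)| shrinks like 1/α² in this example, which also gives a feeling for the degree of approximation discussed in connection with Theorem 1.5.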
For this purpose let us introduce the modulus of continuity of f(x) defined
as

(1.52) $\omega(h; f) = \|f(x + h) - f(x)\|$,

where h is fixed, and the norm is taken with respect to x. Further, let P(u)
and Q(u) be even continuous functions of u, monotone increasing for u > 0,
and vanishing for u = 0. We have then the following

THEOREM 1.5. A sufficient condition that

(1.53) $\lim_{\alpha\to\infty} \|K_\alpha[f] - f\| = 0$

for every f(x) ∈ M(K) is that the following assumptions hold:

($C_1$) f(x) ∈ M(K) implies f(x+h) ∈ M(K), and ‖f(x+h)‖ = ‖f(x)‖ for every
real h.

($C_2$) There shall exist two functions P(u) and Q(u) with the properties stated
above such that

$\|K_\alpha[f] - f\| \le P\{K_\alpha[Q(\omega(t; f))]\}$

for every f ∈ M(K).
($C_3$) $\lim_{h\to 0} \omega(h; f) = 0$ for every f ∈ M(K).


The proof of this theorem follows standard lines, and can be omitted here.
Let us instead consider the justification of imposing such conditions. Our
assumptions are satisfied in $L_p$(−∞, ∞) for 1 ≤ p < ∞, but not for p = ∞.
Indeed, in $L_1$

$\int_{-\infty}^{\infty} |K_\alpha[f] - f|\,dx \le \alpha\int_{-\infty}^{\infty} K(\alpha t)\,dt \int_{-\infty}^{\infty} |f(x + t) - f(x)|\,dx = K_\alpha[\omega(t; f)]$,

so that ($C_2$) is satisfied with P(u) = Q(u) = |u|. Conditions ($C_1$) and ($C_3$)
are evidently also satisfied. If 1 < p < ∞, we have instead

$\int_{-\infty}^{\infty} |K_\alpha[f] - f|^p\,dx \le \alpha\int_{-\infty}^{\infty} K(\alpha t)\,dt \int_{-\infty}^{\infty} |f(x + t) - f(x)|^p\,dx = K_\alpha[(\omega(t; f))^p]$,

so that ($C_2$) is satisfied with $P(u) = |u|^{1/p}$, $Q(u) = |u|^p$. The other conditions
are also known to hold. If p = ∞, conditions ($C_1$) and ($C_2$) still hold, but not
($C_3$). Moreover, formula (1.53) cannot hold for every f(x) ∈ $L_\infty$. Indeed,
convergence in this space is essentially uniform convergence, and a sequence of
continuous functions converges essentially uniformly if and only if it converges
uniformly. This of course implies that the limit function is continuous.
Since $K_\alpha[f]$ is always a continuous function of x, (1.53) cannot hold when
f(x) is discontinuous. In the case of C[−∞, ∞] we have

$\max |K_\alpha[f] - f| \le \alpha\int_{-\infty}^{\infty} K(\alpha t) \max |f(x + t) - f(x)|\,dt = K_\alpha[\omega(t; f)]$,

so that all three conditions hold.

The assumptions of Theorem 1.5 are clearly not necessary, and various
modifications of these assumptions could be given which would preserve
their sufficient character. The reader who reconstructs the omitted proof of
the theorem will find that the convergence of $K_\alpha[f]$ to f as α → ∞ is uniform
in any family of uniformly bounded, equi-continuous functions. He will also
get some idea of what degree of approximation is to be expected. In the
special cases treated in §§3-4 it is possible to find a best degree of approximation
valid for all elements of M(K) which are not invariant.


2. SOME FUNCTIONAL EQUATIONS 


2.1. For the work of the present paragraph it is convenient to add the
following postulate:

($K_3$) K(u) ∈ $L_2$(−∞, ∞).

We shall also need ($K_1$), ($K_2$), ($S_1$) and ($S_2$).

For every function f(x) ∈ S(K) we can form the iterated transformations
$K_\alpha[K_\beta[f]]$ and $K_\beta[K_\alpha[f]]$, and they are also elements of S(K). We are
particularly interested in those cases in which these superposed transforms are
expressible in terms of simple transforms $K_\gamma[f]$, where γ is some function of
α and β. Such cases are revealed by the method of Fourier transforms.

Proceeding formally, let us write

(2.11) $K_\alpha[K_\beta[f]] = \int_{-\infty}^{\infty} K(u; \alpha, \beta) f(u + x)\,du$,

where

(2.12) $K(u; \alpha, \beta) = \alpha\beta\int_{-\infty}^{\infty} K(\alpha s) K(\beta(u - s))\,ds$.

Then

(2.13) $T[x; K(u; \alpha, \beta)] = (2\pi)^{1/2}\,T[x; \alpha K(\alpha s)]\,T[x; \beta K(\beta s)] = (2\pi)^{1/2}\,T[x/\alpha; K(u)]\,T[x/\beta; K(u)]$,

so that

(2.14) $K(u; \alpha, \beta) = (2\pi)^{1/2}\,T\{-u;\ T[x/\alpha; K(v)]\,T[x/\beta; K(v)]\}$,

which can be used for the computation of the composed kernel. This formula
is the basis of all the functional equations in the following.
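2.2. Let us consider some important special cases.
I. Weierstrass’s singular integral. Here

$K(u) = \pi^{-1/2} e^{-u^2}$

and

$T[x; K(u)] = (2\pi)^{-1/2} e^{-x^2/4}$.

It follows that the Fourier transform of the composed kernel is

$(2\pi)^{-1/2} \exp\big[-\big(\tfrac{1}{\alpha^2} + \tfrac{1}{\beta^2}\big)\tfrac{x^2}{4}\big]$,

so that the kernel itself becomes

$\gamma\,\pi^{-1/2} e^{-\gamma^2 u^2}$, $\frac{1}{\gamma^2} = \frac{1}{\alpha^2} + \frac{1}{\beta^2}$.

Hence putting

(2.21) $W_\alpha[f] = \alpha\,\pi^{-1/2}\int_{-\infty}^{\infty} e^{-\alpha^2 u^2} f(u + x)\,du$,

we obtain

(2.22) $W_\alpha[W_\beta[f]] = W_\gamma[f]$, $\frac{1}{\gamma^2} = \frac{1}{\alpha^2} + \frac{1}{\beta^2}$.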

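II. Poisson’s integral for the half-plane. Here

$K(u) = \pi^{-1}(1 + u^2)^{-1}$

and putting

(2.23) $P_\alpha[f] = \frac{\alpha}{\pi}\int_{-\infty}^{\infty} \frac{f(u + x)}{1 + \alpha^2 u^2}\,du$,

we get

(2.24) $P_\alpha[P_\beta[f]] = P_\gamma[f]$, $\frac{1}{\gamma} = \frac{1}{\alpha} + \frac{1}{\beta}$.

We note that this is essentially the same functional equation as that of the
Weierstrass kernel.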
III. Picard’s singular integral. Here 


K(u) = 
and putting 


(2.25) = + x)du, 
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we obtain 
(2.26) (a? — = — [/]. 
IV. Dirichlet’s singular integral. Here

$K(u) = \frac{\sin u}{\pi u}$.

Putting

(2.27) $D_\alpha[f] = \frac{1}{\pi}\int_{-\infty}^{\infty} \frac{\sin \alpha u}{u} f(u + x)\,du$,

we get

(2.28) $D_\alpha[D_\beta[f]] = D_\gamma[f]$, $\gamma = \min(\alpha, \beta)$.

We note that this kernel does not satisfy either ($K_1$) or ($K_2$). This fact makes
the investigation of the corresponding transformation much more complicated.

Other examples of simple functional equations could undoubtedly be 
found in this connection. The importance of these four transformations is 
such, however, that a special investigation of their properties as revealed by 
the functional equations is warranted. This will be done below. 


3. THE POISSON-WEIERSTRASS CASE 


3.1. Equations (2.22) and (2.24) reduce to the common form

(3.11) $F_\lambda[F_\mu[f]] = F_{\lambda+\mu}[f]$

by an obvious change of parameters. This equation is consequently satisfied
by the two transformations

flu t+ x) 


t, 


(3.13) WyLf] = f eM + x)du. 


This fact is undoubtedly well known to mathematical physicists.* P,[/] is 


* Several writers on the theory of the equation of heat conduction have observed such functional 
equations. P. Appell gave equation (3.11) for W,[ f] in Journal de Mathématiques, (4), vol. 8 (1892), 
pp. 187-216, p. 201. Cesaro, Académie Royale de Belgique, Bulletin de la Classe des Sciences, 1902, 
pp. 387-407, p. 392, noted that certain solutions form the elements of an Abelian group. G. Doetsch 
has produced a number of related transcendental addition theorems; see especially Mathematische 
Zeitschrift, vol. 25 (1926), pp. 608-626, p. 615. I am not aware of similar considerations having been 
made for Poisson’s integral. 
the solution of Dirichlet's problem for the upper half-plane corresponding to
the boundary values f(x) on the x-axis, whereas $W_\lambda[f]$ is a solution of the
equation of heat conduction in one dimension corresponding to a given
initial temperature f(x). These interpretations make equation (3.11)
intuitively obvious.
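The semi-group law (3.11) can be checked numerically for the Poisson kernel. The following is a minimal sketch, assuming the normalization $P_\lambda[f](x) = \frac{\lambda}{\pi}\int f(u+x)\,(u^2+\lambda^2)^{-1}\,du$; the function names and the quadrature rule are illustrative choices and not part of the paper. It compares the convolution of the kernels with parameters $\alpha$ and $\beta$ against the kernel with parameter $\alpha+\beta$.

```python
import math


def poisson_kernel(lam, u):
    """Kernel of P_lambda in the assumed form lam / (pi * (u^2 + lam^2))."""
    return lam / (math.pi * (u * u + lam * lam))


def convolve_poisson(alpha, beta, x, n=4000):
    """Approximate the convolution (p_alpha * p_beta)(x) over the whole line.

    The substitution u = tan(theta) maps the line onto (-pi/2, pi/2); the
    midpoint rule is then applied in theta (an illustrative quadrature choice).
    """
    h = math.pi / n
    total = 0.0
    for j in range(n):
        theta = -math.pi / 2 + (j + 0.5) * h
        u = math.tan(theta)
        jacobian = 1.0 + u * u  # du = (1 + tan^2 theta) d(theta)
        total += poisson_kernel(alpha, x - u) * poisson_kernel(beta, u) * jacobian
    return total * h


if __name__ == "__main__":
    alpha, beta, x = 0.7, 1.3, 0.4
    # The two printed numbers should agree to many digits.
    print(convolve_poisson(alpha, beta, x), poisson_kernel(alpha + beta, x))
```

Composing two transformations of this type amounts to convolving their kernels, so agreement of the two numbers is the kernel form of $P_\alpha[P_\beta[f]] = P_{\alpha+\beta}[f]$.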

We choose for S(P) and S(W) the classes of measurable functions defined
on $(-\infty, \infty)$ for which (3.12) and (3.13) respectively exist as proper
Lebesgue integrals for every $\lambda > 0$. This choice is evidently in agreement with
$(S_1)$, and a moment's consideration will show that $(S_2)$ is also fulfilled, and
that (3.11) holds for any such function f(x). S(P) is simply the class of all
f(x) such that $f(x)/(1+x^2) \in L_1(-\infty, \infty)$. S(W) cannot be characterized in
such simple terms.

The transforms $P_\lambda[f]$ and $W_\lambda[f]$ are analytic functions of x and of $\lambda$. For
a fixed real x, $P_\lambda[f]$ defines one analytic function of $\lambda$ in the right half-plane
and another in the left, which are holomorphic in the half-planes in question,
whereas $W_\lambda[f]$ is holomorphic in the right half-plane and ordinarily does not
exist in the left one. For a fixed positive $\lambda$, $P_\lambda[f]$ is an analytic function of x,
holomorphic in the strip $-\lambda < \Im(x) < \lambda$,* whereas $W_\lambda[f]$ is an entire function
of x.

In the present case formula (1.51) holds in a sharper form, viz.,

(3.14)    $\lim_{\lambda\to 0} P_\lambda[f] = f(x), \qquad \lim_{\lambda\to 0} W_\lambda[f] = f(x),$

for almost all x whenever $f(x) \in S(P)$ or $S(W)$.
3.2. Problem (i) has a very simple solution in this case: 


THEOREM 3.2. If $P_\alpha[f] = 0$ or $W_\alpha[f] = 0$ for a fixed $\alpha$, and $f \in S(P)$ or $S(W)$
respectively, then $f(x) \sim 0$.

This is pretty well known. A proof is obtained by observing that $F_\alpha[f] = 0$
implies $F_{\alpha+\beta}[f] = 0$ for every $\beta > 0$ by (3.11). $F_\lambda[f]$ being analytic in $\lambda$ must
then vanish identically, and (3.14) shows that this implies $f(x) \sim 0$.

The same argument shows that $F_\alpha[f_1] \neq F_\alpha[f_2]$ unless $f_1(x) \sim f_2(x)$.

3.3. Let us now consider problem (ii). It is required to find whether, for
a fixed $\alpha$, the equation

(3.31)    $F_\alpha[f] = f$

can have any solution in S other than the trivial one, f = constant. If there
exists such a solution f(x) then (3.11) shows that the corresponding transform
$F_\lambda[f]$ satisfies the equation

    $F_{\lambda+\alpha}[f] = F_\lambda[f]$

for every $\lambda$. Hence $F_\lambda[f]$ is an analytic function of $\lambda$ with period $\alpha$. From this
point onwards the two cases must be treated separately.

* $P_\lambda[f]$ also defines two other analytic functions of x, one holomorphic above this strip, the other
one below it.

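In the Poisson case we note that if $f(x) \in S(P)$ and x is fixed, then $P_\lambda[f] = o(|\lambda|)$.
Hence, if $P_\lambda[f]$ is periodic in $\lambda$ with period $\alpha$, it must be a constant
with respect to $\lambda$, i.e., $P_\lambda[f] = f$ identically in $\lambda$. But $P_\lambda[f]$ is a potential
function for $\lambda > 0$, i.e.,

    $\left(\dfrac{\partial^2}{\partial x^2} + \dfrac{\partial^2}{\partial \lambda^2}\right) P_\lambda[f] = 0,$

and

    $\dfrac{\partial^2}{\partial \lambda^2} P_\lambda[f] = 0,$

since $P_\lambda[f]$ is independent of $\lambda$. Hence

    $\dfrac{\partial^2}{\partial x^2} P_\lambda[f] = 0,$

so that $P_\lambda[f] = f$ is a linear function of x. But x is clearly not in S(P), hence
f is a constant. Thus we have proved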


THEOREM 3.3. The only function in S(P) which is invariant under a Poisson 
transformation P, is f(x) =const., and this function is invariant under all such 
transformations. 


3.4. The Weierstrass case is rather different. We have seen that $W_\lambda[f]$
must be an analytic function of $\lambda$ with period $\alpha$. Being holomorphic in the
right half-plane, $W_\lambda[f]$ must then be an entire function of $\lambda$ as well as of x.
We have consequently

(3.41)    $W_\lambda[f] = \sum_{n=-\infty}^{\infty} A_n(x)\, \exp[2\pi i n \lambda/\alpha].$

Here the coefficients are entire functions of x which tend to zero faster than
$\exp[-B|n|]$, B arbitrary, as $n \to \infty$, x being fixed. But $W_\lambda[f]$ is a solution of
the partial differential equation

(3.42)    $\dfrac{\partial^2 W}{\partial x^2} = 4\,\dfrac{\partial W}{\partial \lambda},$

and we are clearly entitled to differentiate term by term in (3.41). It follows
that
that 


(3.43) A!" (x) = (n =0,+1,+2,---). 
Consequently $A_0(x)$ is a linear combination of

(3.44) 1 and x, 

A,(x) is a linear combination of 

(3.45) exp [(2rin/a)/2(1 + i)x] and exp [(2rin/a)/2(— 1 — i)x] 

if n>0, and of 

(3.46) exp [(2xi| m| — i)x] and exp [(2mi| | /a)/*(— 1+ 


if $n < 0$. It follows that the equation (3.31) has a continuum of solutions in
the Weierstrass case. These solutions have a denumerable basis, viz., the
functions of (3.44)-(3.46). Any linear combination of these functions, the
coefficients of which satisfy the restriction of tending to zero faster than any
function of n of the form $\exp[-B|n|]$ as $n \to \infty$, assuming that there are
infinitely many terms, is a solution of (3.31).
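The invariance of the exponential basis functions can be tested numerically. This is a hedged sketch, not the paper's computation: it assumes the normalization $W_\lambda[f](x) = (\pi\lambda)^{-1/2}\int e^{-u^2/\lambda} f(u+x)\,du$, under which $W_\lambda[e^{cx}] = e^{c^2\lambda/4}\,e^{cx}$, so that $e^{cx}$ is reproduced by $W_a$ exactly when $c^2 = 8\pi i n/a$ for an integer n. The exponent is derived from that assumed normalization; the helper names and quadrature parameters are hypothetical.

```python
import cmath
import math


def weierstrass(f, lam, x, half_width=16.0, n=8000):
    """Trapezoidal estimate of (pi*lam)^(-1/2) * int exp(-u^2/lam) f(u+x) du.

    The normalization is an assumption; half_width and n are illustrative.
    """
    h = 2.0 * half_width / n
    total = 0j
    for j in range(n + 1):
        u = -half_width + j * h
        weight = 0.5 if j in (0, n) else 1.0
        total += weight * math.exp(-u * u / lam) * f(u + x)
    return total * h / math.sqrt(math.pi * lam)


def exponential_solution(n_index, a):
    """Return x -> exp(c*x) with c^2 = 8*pi*i*n_index/a, so exp(c^2*a/4) = 1."""
    c = cmath.sqrt(8j * math.pi * n_index / a)
    return lambda t: cmath.exp(c * t)


if __name__ == "__main__":
    f = exponential_solution(1, 1.0)
    # W_a[f] should reproduce f, while W_{a/2}[f] should give -f.
    print(weierstrass(f, 1.0, 0.3), f(0.3))
    print(weierstrass(f, 0.5, 0.3), -f(0.3))
```

The second comparison illustrates that the same function is not invariant under a transformation with a different parameter, in line with the remark that only linear functions are common invariants.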

These solutions are entire functions of x. Their rate of growth is subject
to rather interesting limitations. Suppose that x is real and $|f(x)| < A e^{k|x|}$,
where k is a positive constant. A simple calculation shows that for $\lambda = \sigma + i\tau$

(3.47)    $|W_\lambda[f]| \le 2A\,(1 + \tau^2/\sigma^2)^{1/4} \exp[k|x| + k^2(\sigma^2+\tau^2)/(4\sigma)].$

Suppose now that it is known that $W_\lambda[f]$ has the period $\alpha$. Then we have the
same estimate if we replace $\lambda$ by $\lambda + n\alpha$. Here we can choose n so as to minimize
$((\sigma+n\alpha)^2+\tau^2)/(\sigma+n\alpha)$. This minimum lies arbitrarily close to $2|\tau|$
if $|\tau|$ is large. Hence for an f(x) which produces a periodic solution we can
replace (3.47) by

(3.48)    $|W_\lambda[f]| \le 2^{5/4} A \exp[k|x| + k^2|\tau|/2].$

But this estimate implies that $W_\lambda[f]$ is a rational function of
$w = \exp[2\pi i\lambda/\alpha]$ with singularities only at 0 and $\infty$, or more precisely,
$W_\lambda[f] = w^{-n} P_{2n}(w)$ where $P_{2n}(w)$ is a polynomial in w of degree $\le 2n$ and
$n = [k^2\alpha/(4\pi)]$. This result gives us additional information about the
solutions of (3.31). It follows that any solution which involves infinitely many
functions of the basis must occasionally grow faster than any function of the
form $e^{k|x|}$ on the real axis. On the other hand, a simple calculation shows that
if such a solution is an entire function of order two, it is of the minimal type
of that order. Suitably chosen “lacunary series” in terms of the basis
functions show that this estimate cannot be essentially improved upon. In the
other direction we notice that the only solution which is at most of the minimal
type of order one is Ax+B, and this is the only invariant common to all
Weierstrass transformations.
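The minimization used in passing from (3.47) to (3.48) can be made explicit. The following short computation is a sketch that treats $s = \sigma + n\alpha$ as a continuous positive variable; this is an assumption, since n only takes integer values, which is why the minimum is only approached arbitrarily closely.

```latex
\[
g(s) = \frac{s^{2}+\tau^{2}}{s} = s + \frac{\tau^{2}}{s}, \qquad
g'(s) = 1 - \frac{\tau^{2}}{s^{2}} = 0 \iff s = |\tau|, \qquad
g(|\tau|) = 2|\tau| .
\]
\[
g(s) - 2|\tau| = \frac{(s-|\tau|)^{2}}{s}
\le \frac{\alpha^{2}}{4\,(|\tau| - \alpha/2)} \longrightarrow 0
\quad (|\tau| \to \infty)
\qquad \text{whenever } \bigl|\, s - |\tau| \,\bigr| \le \alpha/2 .
\]
```

Since some value $\sigma + n\alpha$ always lies within $\alpha/2$ of $|\tau|$, the exponent $k^2 g(s)/4$ can be brought as close to $k^2|\tau|/2$ as desired for large $|\tau|$.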
It should be added that the preceding results also permit a complete
determination of the solutions of the equation

    $F_\alpha[f] = F_\beta[f]$

in the two cases under consideration. The reader will have no difficulties in
supplying the details.

3.5. We shall now study the character of the deformation defined by
$F_\lambda[f]$ in metric sub-spaces of S(K). We consider two sub-sets M(P) and M(W)
of S(P) and S(W) respectively which we suppose satisfy postulates $(M_1)$,
$(M_2)$ and $(M_3)$. In addition we shall require

$(M_4)$ $f(x) \in M(K)$ implies $|f(x)| \in M(K)$, and the inequality $|f(x)| \le |g(x)|$
for almost all x implies $\|f\| \le \|g\|$.

A particular consequence of $(M_4)$ is that f(x) and $|f(x)|$ have the same
norm since $|f(x)| \le \bigl|\,|f(x)|\,\bigr|$ and vice versa.
An immediate consequence of $(M_3)$ together with the functional equation
(3.11) is that

(3.51)    $\|F_\beta[f]\| \le \|F_\alpha[f]\|$ for $0 < \alpha < \beta$,

so that the transformation $F_\lambda[f]$ is a steady contraction of the space M, and

(3.52)

It follows that $\lim_{\alpha\to\infty} \|F_\alpha[f]\|$ exists and is $\ge 0$. If $f(x) \in L_p(-\infty, \infty)$,
$1 < p < \infty$, or more generally, if


1 1 
(3.53) lim — = 0, f | f(t)\ di S A, 
then 
(3.54) lim F.[f| = 0 


for all x by a theorem of N. Wiener.* 
$\|F_\lambda[f]\|$ is a functional of f(x) and a function of $\lambda$. For a fixed $\lambda$ it is clearly
a continuous functional of f(x) in M by virtue of $(M_3)$. Let us now consider
its properties as a function of the real positive variable $\lambda$ for fixed f(x).
Formula (3.51) expresses that $\|F_\lambda[f]\|$ is a monotone decreasing function
of $\lambda$. We have for


and 


* See S. Bochner, Fouriersche Integrale, p. 30. 
Hence if

(3.55)    $\lim_{h\to 0} \|F_h[f] - f\| = 0,$

then $\|F_\lambda[f]\|$ is continuous for every $\lambda \ge 0$, and the elements $F_\lambda[f]$ form a
continuous curve in M having f as one of its end points. On the other hand,
if $\|F_\lambda[f]\|$ is not continuous at $\lambda = \lambda_0$, but has a jump j at this point, then


lim — = 
h—0 


for every $\lambda$, $0 < \lambda < \lambda_0$, and the distance from $F_\alpha[f]$ to $F_\beta[f]$ would be at least
j if either $\alpha$ or $\beta$ belongs to the range $[0, \lambda_0]$. In particular, the distance
between f(x) and any one of its transforms must be at least j. It seems difficult
to exclude this possibility a priori, but we shall show that it cannot occur if
$F_\lambda$ be interpreted as $P_\lambda$ or $W_\lambda$.

3.6. We have 


Prsalf] — Palf] = 


— + + kh) — 
n(X + h)(2A + a (u? + d*)(u? + (A + A)?) 


1 f(u + x)du 


Here we take norms on both sides, noting that the norm of a sum is not
greater than the sum of the norms. In the second term we note that
$u^2+(\lambda+h)^2 \ge (\lambda+h)^2$, and apply hypothesis $(M_4)$. Combining these steps we
get for $h > 0$

2A+h 


A+A 


or 


If $-\lambda < h < 0$ we get instead


h h 


| 

| 
In the case of $W_\lambda[f]$ we have


h 
W 
r+alf] + (A + ] 
u2/(A+h) — d 
A simple calculation shows that 
h 
| enw Oth) — ew | < 
Le(A + h) 


Hence we get for h>0 


or 


where we have used formula (3.51) in addition to hypothesis $(M_4)$. For
$-\lambda < h < 0$ we get instead
2| | 2\ hl 
(3.64) \| Wazalf] — < s 
A+h 
These formulas show that $\{P_\lambda[f]\}$ and $\{W_\lambda[f]\}$ are continuous families
of continuous transformations in M(P) and M(W) respectively for $\lambda > 0$.
If $\lambda \ge \lambda_0 > 0$, $\|f\| \le B$, these families satisfy a Lipschitz condition of order one
with respect to $\lambda$, uniformly in $\lambda$ and in f. It follows in particular that the
monotone decreasing functions $\|P_\lambda[f]\|$ and $\|W_\lambda[f]\|$ are continuous for
$\lambda > 0$.
It is not possible to prove continuity at $\lambda = 0$ by these considerations. As
a matter of fact we recall from the result of the discussion in §1.5 that (3.55)
is not true for all metric sub-spaces of S. In particular, it was shown to be false
in $L_\infty(-\infty, \infty)$.
3.7. Let us now consider the transformation

(3.71)    $E_\lambda = I - F_\lambda,$

where I is the identity. As a consequence of (3.11) we get

(3.72)    $E_{\alpha+\beta}[f] = E_\alpha[f] + E_\beta[f] - E_\alpha[E_\beta[f]].$

It is easy to see that the operations $E_\alpha$ and $F_\beta$ commute. Formula (3.72) is
less useful to us than the mixed equation

(3.73)    $E_{\alpha+\beta}[f] = E_\alpha[f] + F_\alpha[E_\beta[f]] = E_\beta[f] + F_\beta[E_\alpha[f]].$

Using the first and the last member of this equation we get


whence, by virtue of (M3), 


A particular consequence of this relation is that 


|| 2 


and this leads to the important conclusion that

(3.75)    $\limsup_{h\to 0} \dfrac{1}{h}\,\|E_h[f]\| \ge \dfrac{1}{\alpha}\,\|E_\alpha[f]\|$

for every fixed positive $\alpha$. It follows that the degree of approximation of a
function f(x) by its Poisson or Weierstrass transform is definitely limited
to be of the first order at best. Indeed, if the limit on the left-hand side is
zero, then $\|E_\alpha[f]\| = 0$ for every $\alpha$, i.e., f(x) is an invariant element of the
space M under all transformations $F_\alpha$. These were determined in §3.3 for
the Poisson case and in §3.4 for that of Weierstrass. We have consequently
proved


THEOREM 3.7. If $f(x) \in M(P)$ and

(3.76)    $\lim_{h\to 0} \dfrac{1}{h}\,\|P_h[f] - f\| = 0,$

then f(x) = const. If $f(x) \in M(W)$ and

(3.77)    $\lim_{h\to 0} \dfrac{1}{h}\,\|W_h[f] - f\| = 0,$

then f(x) = Ax+B.


It follows in particular that if M(P) or M(W) coincides with $L_p(-\infty, \infty)$,
$1 < p < \infty$, then (3.76) or (3.77) implies f(x) = 0. The theorem shows that an
inequality of the form


(3.78)    $\|E_h[f]\| \ge C h$

holds for every $f \in M$ and for infinitely many values of $h \to 0$. Here C is a
non-negative constant depending only upon f which equals zero if and only if
f is invariant under all transformations $F_\alpha$. The estimates of §3.6 show on the
other hand that the inequality (3.78) can be reversed for all those functions
of the space M which are themselves transforms, i.e., which can be written
as $f = F_\alpha[g]$ with $g \in M$. It follows that in a space M whose metric satisfies
the conditions stated in §3.5 the degree of approximation of a function f(x)
by its Poisson or Weierstrass transform is at best of the first order with
respect to $\alpha$, except for the fixed elements, and that this order is actually
reached for an infinite subclass of the space, namely by all the transforms.


4. THE PICARD CASE 


4.1. We shall now take up for discussion Picard's equation

(4.11)    $(\alpha^2-\beta^2)\,\Pi_\alpha[\Pi_\beta[f]] = \alpha^2\,\Pi_\beta[f] - \beta^2\,\Pi_\alpha[f].$

$S(\Pi)$ is the class of all measurable functions f(x) such that (2.25) exists as a
proper Lebesgue integral for every $\alpha > 0$. This assumption means that $(S_1)$
is satisfied, and it is easy to see that $(S_2)$ is then also satisfied, and that (4.11)
holds for any such function.
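Picard's equation can be checked on the Fourier side. The following is a sketch rather than the author's argument: it assumes the normalization $\Pi_\alpha[f](x) = \frac{\alpha}{2}\int e^{-\alpha|u|} f(u+x)\,du$ and functions f to which the Fourier transform applies, and the symbol $m_\alpha$ for the multiplier is notation introduced here for illustration.

```latex
\[
m_\alpha(t) = \frac{\alpha}{2}\int_{-\infty}^{\infty} e^{-\alpha|u|}\,e^{itu}\,du
            = \frac{\alpha^{2}}{\alpha^{2}+t^{2}},
\]
\[
\alpha^{2} m_\beta(t) - \beta^{2} m_\alpha(t)
  = \alpha^{2}\beta^{2}\,
    \frac{(\alpha^{2}+t^{2})-(\beta^{2}+t^{2})}{(\alpha^{2}+t^{2})(\beta^{2}+t^{2})}
  = (\alpha^{2}-\beta^{2})\, m_\alpha(t)\, m_\beta(t).
\]
```

Since composing two such transformations multiplies their multipliers, the last identity is the relation $(\alpha^2-\beta^2)\,\Pi_\alpha[\Pi_\beta[f]] = \alpha^2\,\Pi_\beta[f] - \beta^2\,\Pi_\alpha[f]$ in transformed form.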

The transform $\Pi_\alpha[f]$ is an analytic function of $\alpha$, regular in the right
half-plane. It can be shown that $\Pi_\alpha[f]$ is absolutely continuous and possesses
a second-order partial derivative with respect to x for almost all x, and
satisfies the differential equation

(4.12)    $\dfrac{\partial^2}{\partial x^2}\,\Pi_\alpha[f] = \alpha^2 \{\Pi_\alpha[f] - f\}$

almost everywhere.
It is well known that

(4.13)    $\lim_{\alpha\to\infty} \Pi_\alpha[f] = f(x)$

for almost all x when $f(x) \in S(\Pi)$.
4.2. Formula (4.12) gives us the following complete solution of problem (i).

THEOREM 4.2. If $\Pi_\alpha[f] = 0$ for some $\alpha > 0$, then $f(x) \sim 0$.

The same conclusion can be drawn from (4.11) combined with (4.13).
The same argument shows that $\Pi_\alpha[f_1] = \Pi_\alpha[f_2]$ implies $f_1(x) \sim f_2(x)$.

4.3. The question of invariant elements is also easily answered. Suppose
that for some $\alpha > 0$

(4.31)    $\Pi_\alpha[f] = f.$

Formula (4.12) then shows that

    $\dfrac{\partial^2}{\partial x^2}\,\Pi_\alpha[f] = 0.$

Hence we have proved

THEOREM 4.3. The only functions in $S(\Pi)$ which are invariant under a
Picard transformation $\Pi_\alpha$ are the linear functions, Ax+B, and these functions
are invariant under all such transformations.


We recall that it was shown in §1.3 that the linear functions are left
invariant by every transformation $K_\alpha[f]$ whose kernel is an even function and
which satisfies $(K_1)$ and $(K_2)$. The Picard transformation has consequently
no other invariant elements than those common to this class of transformations.

The equation

(4.32)    $\Pi_\alpha[f] = \Pi_\beta[f]$

can be treated in the same manner. Together with (4.11) it implies

    $\Pi_\alpha[\Pi_\alpha[f] - f] = 0,$

whence $\Pi_\alpha[f] - f = 0$, and f(x) = Ax+B.
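4.4. Let us now consider a linear sub-space $M(\Pi)$ of $S(\Pi)$ in which we
introduce a metric subject to postulates $(M_1)$, $(M_2)$ and $(M_3)$. Note that
$(M_4)$ is not assumed. We shall show that $(M_3)$, i.e., the assumption

(4.41)    $\|\Pi_\alpha[f]\| \le \|f\|$

for every $\alpha > 0$, implies that

(4.42)    $\|\Pi_\alpha[f]\| \le \|\Pi_\beta[f]\|, \qquad \alpha < \beta,$

i.e., the analogue of (3.51). We can write (4.11)

    $\Pi_\alpha[f] = \dfrac{\alpha^2}{\beta^2}\,\Pi_\beta[f] + \left(1 - \dfrac{\alpha^2}{\beta^2}\right) \Pi_\alpha[\Pi_\beta[f]].$

This gives

    $\beta^2 \|\Pi_\alpha[f]\| \le \alpha^2 \|\Pi_\beta[f]\| + (\beta^2 - \alpha^2)\,\|\Pi_\beta[f]\|.$

It follows that $\|\Pi_\alpha[f]\|$ is a monotone increasing function of $\alpha$, and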


(4.43)


In particular, $\|\Pi_\alpha[f]\|$ tends to a finite limit $\ge 0$ as $\alpha \to 0$. The
transformation $\Pi_\alpha[f]$ is ordinarily not defined for $\alpha = 0$, and need not tend to any finite
limit as $\alpha \to 0$, as is shown by the simple example $\Pi_\alpha[x^2] = x^2 + 2\alpha^{-2}$. On the
other hand, if the mean value of f(x) over the range $(-T, T)$ is uniformly
bounded with respect to T, and tends to a finite limit M[f] as $T \to \infty$, then
by Wiener's theorem

(4.44)    $\lim_{\alpha\to 0} \Pi_\alpha[f] = M[f]$

for all x, uniformly over any fixed finite interval. But it is obvious a priori
that this result does not enable us to draw any conclusion regarding the
numerical value of $\lim_{\alpha\to 0} \|\Pi_\alpha[f]\|$.
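The example $\Pi_\alpha[x^2] = x^2 + 2\alpha^{-2}$ and the invariance of linear functions can be confirmed by direct quadrature. This is a minimal sketch assuming the normalization $\Pi_\alpha[f](x) = \frac{\alpha}{2}\int e^{-\alpha|u|} f(u+x)\,du$; the function name `picard` and the Simpson parameters are illustrative and not taken from the paper.

```python
import math


def picard(f, a, x, s_max=60.0, n=6000):
    """Simpson estimate of (a/2) * int exp(-a|u|) f(u+x) du over the line.

    With s = a*u the two half-lines fold into a single integral over
    [0, s_max] of 0.5 * exp(-s) * (f(x + s/a) + f(x - s/a)); n must be even.
    """
    h = s_max / n

    def g(s):
        return 0.5 * math.exp(-s) * (f(x + s / a) + f(x - s / a))

    total = g(0.0) + g(s_max)
    for j in range(1, n):
        total += (4 if j % 2 else 2) * g(j * h)
    return total * h / 3.0


if __name__ == "__main__":
    a, x = 2.0, 0.5
    # Quadratic: compare with x^2 + 2/a^2.  Linear: compare with the input itself.
    print(picard(lambda t: t * t, a, x), x * x + 2.0 / a ** 2)
    print(picard(lambda t: 3.0 * t + 1.0, a, x), 3.0 * x + 1.0)
```

The quadratic case shows the term $2\alpha^{-2}$ that grows without bound as $\alpha \to 0$, while the linear case is reproduced for every $\alpha$.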
4.5. The continuity properties of the Picard transform are on the whole 


simpler than in the Poisson-Weierstrass case. We can rewrite (4.11) in the 
form 


a — 


(4.51) Nalf] — = ; 


Putting 
(4.52)    $\Pi_\alpha[f] = f - H_\alpha[f],$


we get 


a — 


This relation leads to the inequalities 


(4.53) Half} — = 


a 


< \(=)- 


a 2 
since obviously 


(4.55) 


Since $\Pi_\beta$ and $H_\alpha$ commute, we have also


| 


It follows from these inequalities that $\Pi_\alpha[f]$ regarded as an element of
$M(\Pi)$ is continuous with respect to $\alpha$, $0 < \alpha < \infty$, and that $\|\Pi_\alpha[f]\|$ is a
continuous function of $\alpha$ in the same range. We do not have continuity at either
zero or infinity except in special cases. Formula (4.42) expresses the fact that
$\|\Pi_\alpha[f]\|$ is an increasing function of $\alpha$. The rate of growth is limited by the
inequality

B 2 


Qa 


which is a consequence of (4.54). 
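4.6. Let us now consider the transformation $H_\alpha[f]$ in more detail. It
satisfies the functional equation

(4.61)    $(\alpha^2-\beta^2)\,H_\alpha[H_\beta[f]] = \alpha^2 H_\alpha[f] - \beta^2 H_\beta[f]$

and the mixed equations

(4.62)    $\alpha^2 (H_\alpha[f] - H_\beta[f]) = (\beta^2-\alpha^2)\,\Pi_\alpha[H_\beta[f]],$
(4.63)    $\alpha^2 H_\alpha[\Pi_\beta[f]] = \beta^2\,\Pi_\alpha[H_\beta[f]].$

Suppose that $\alpha < \beta$. Then (4.62) gives

    $\alpha^2 \|H_\alpha[f]\| \le (\alpha^2 + \beta^2 - \alpha^2)\,\|H_\beta[f]\|,$

so that

(4.64)    $\alpha^2 \|H_\alpha[f]\| \le \beta^2 \|H_\beta[f]\|, \qquad \alpha < \beta.$

This inequality states that

    $\alpha^2 \|\Pi_\alpha[f] - f\|$

is an increasing function of $\alpha$. Hence it can tend to zero as $\alpha \to \infty$ if and only
if it is identically zero, i.e., if and only if f(x) is invariant under all Picard
transformations. We have consequently proved


THEOREM 4.6. If $f(x) \in M(\Pi)$, and

(4.65)    $\lim_{\alpha\to\infty} \alpha^2 \|\Pi_\alpha[f] - f\| = 0,$

then f(x) = Ax+B.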


It follows that there exists a non-negative constant C for every $f(x) \in M(\Pi)$
such that

    $\|\Pi_\alpha[f] - f\| \ge C\alpha^{-2}$

for infinitely many values of $\alpha \to \infty$, and C = 0 if and only if f(x) = Ax+B.
Hence in a space $M(\Pi)$ whose metric is subject to the restrictions stated
above, the degree of approximation of f(x) by its Picard transform is of the
second order with respect to $1/\alpha$ at the best. This order is actually reached,
however, namely by all elements which are themselves transforms of
elements of M, i.e., for every $g = \Pi_\beta[f]$. Indeed, formula (4.63) tells us that

    $\alpha^2 \|H_\alpha[g]\| = \beta^2 \|\Pi_\alpha[H_\beta[f]]\|.$

The right-hand side does not exceed $2\beta^2\|f\|$ independently of $\alpha$. Hence the
left-hand side remains bounded as $\alpha \to \infty$, i.e.,

    $\|\Pi_\alpha[g] - g\| = O(\alpha^{-2}).$

This proves the assertion.


5. THE DIRICHLET CASE


5.1. The kernel in the Dirichlet case differs fundamentally in some
respects from the kernels in the cases which we have discussed so far. Thus
it satisfies neither $(K_1)$ nor $(K_2)$. One is constantly hampered by these
defects when trying to extend the preceding theory to the Dirichlet case. The
difficulties start right at the beginning, viz., with the determination of S(D).
It is by no means sufficient that $(S_1)$ is satisfied in order that $(S_2)$ be also
satisfied as well as the functional equation

(5.11)    $D_\alpha[D_\beta[f]] = D_\gamma[f], \qquad \gamma = \min(\alpha, \beta).$

Both the originators of the zero element and the invariant elements form
linear manifolds which are difficult to characterize. Finally if we come to the
question of metric sub-spaces M(D), it turns out that $(M_3)$, which was basic
in the previous discussion, is no longer valid in the cases of main interest.
The only instance to which our methods obviously apply is the space
$L_2(-\infty, \infty)$. Here the transforms exist, belong to the same space, and satisfy (5.11).
Problems (i) and (ii) can be completely solved. The space is metric and $(M_3)$
holds. It is not possible to extend all of what we are doing to the case
$L_p(-\infty, \infty)$, $p \ne 2$, but we shall note below what results are valid in the more general
case. In view of this situation the space will be taken to be $L_2(-\infty, \infty)$
unless otherwise stated.

5.2. The solutions of problem (i) can be obtained by the method of
Fourier transforms along the lines given in §1.2. We have

(5.21)    $T[x; K(u)] = \begin{cases} 0 & \text{for } |x| > 1, \\ (2\pi)^{-1/2} & \text{for } |x| < 1. \end{cases}$

Hence we are confronted with case (2) in the notation of §1.2. It follows that
a necessary and sufficient condition that

(5.22)    $D_\alpha[f] = 0, \qquad f \in L_2,$

is that

(5.23)    $f(x) = \mathrm{l.i.m.} \left\{ \int_{-\infty}^{-\alpha} + \int_{\alpha}^{\infty} \right\} e^{ixu} F(u)\,du,$

where F(u) is an arbitrary function in $L_2$. The set M of all such functions
f(x) is obviously a linear manifold in $L_2$.
5.3. The same method applies to problem (ii). Suppose

(5.31)    $D_\alpha[g] = g.$

We have again case (2) of §1.3. It follows that a necessary and sufficient
condition in order that g(x) shall satisfy (5.31) is that

(5.32)    $g(x) = \int_{-\alpha}^{\alpha} e^{ixu} G(u)\,du,$

where G(z) is an arbitrary function of Ze. The set of all such functions forms 
a linear manifold §. We note that § and M are orthogonal complements of 
each other in Le, since 


The discussion and results of §§5.2 and 5.3 extend without difficulty to
the case in which we replace $L_2$ by $L_p$, $1 < p < 2$. The case p = 1 is not
accessible because $D_\alpha[f]$ need not be in $L_1$ when $f \in L_1$.* In case $p > 2$ the method
breaks down because the method of Fourier transforms fails.
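Formula (5.11) and the orthogonality of the two manifolds can be read off from the Fourier transform of the kernel. This is a sketch under the assumption that, with the conventions of §1.2 (not reproduced here), $D_\alpha$ acts on the Fourier transform F of $f \in L_2$ as multiplication by the indicator function of $(-\alpha, \alpha)$; the symbol $\chi_\alpha$ is notation introduced for illustration.

```latex
\[
T[x; D_\alpha[f]] = \chi_\alpha(x)\,F(x), \qquad
\chi_\alpha(x) = \begin{cases} 1, & |x| < \alpha, \\ 0, & |x| > \alpha, \end{cases}
\]
\[
\chi_\alpha\,\chi_\beta = \chi_{\min(\alpha,\beta)}
\;\Longrightarrow\;
D_\alpha[D_\beta[f]] = D_{\min(\alpha,\beta)}[f],
\]
\[
D_\alpha[f] = 0 \iff F = (1-\chi_\alpha)F, \qquad
D_\alpha[g] = g \iff G = \chi_\alpha G, \qquad
\int_{-\infty}^{\infty} (1-\chi_\alpha)F\;\overline{\chi_\alpha G}\,du = 0 .
\]
```

The last integral vanishes because the two factors have disjoint supports, which is the transformed statement that functions annihilated by $D_\alpha$ are orthogonal to functions left invariant by it.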

5.4. Supposing $f(x) \in L_2$, let us put $T[x; f] = F(x)$. A simple calculation
shows that


—@ 


We have consequently 


* For the properties of $D_\alpha[f]$ in $L_p(-\infty, \infty)$, $1 < p < \infty$, see E. Hille and J. D. Tamarkin, Bulletin
of the American Mathematical Society, vol. 39 (1933), pp. 768-774.



152 EINAR HILLE 


or 


(5.43) 
so that $(M_3)$ holds. We then get from (5.41) that


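It is obvious that $\|D_\alpha[f]\|$ is continuous, and

(5.45)    $\lim_{\alpha\to 0} \|D_\alpha[f]\| = 0, \qquad \lim_{\alpha\to\infty} \|D_\alpha[f]\| = \|f\|.$

Further, $\{D_\alpha[f]\}$ is a continuous family of continuous transformations
defined over $L_2$. Let us put

(5.46)    $E_\alpha[f] = f - D_\alpha[f].$

$E_\alpha[f]$ is also in $L_2$. Its Fourier transform equals F(x) for $|x| > \alpha$, and zero
for $|x| < \alpha$. Hence

(5.47)    $\|E_\alpha[f]\|^2 = \left\{ \int_{-\infty}^{-\alpha} + \int_{\alpha}^{\infty} \right\} |F(u)|^2\,du.$

It follows that

(5.48)    $\|E_\alpha[f]\| \ge \|E_\beta[f]\|, \qquad \alpha < \beta,$
(5.49)    $\lim_{\alpha\to\infty} \|E_\alpha[f]\| = 0.$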


It is clear that (5.49) does not hold uniformly for all f(x) having norms under
a fixed bound, nor is it possible to assign any limits one way or the other to
the degree of approximation with respect to $1/\alpha$.

Formula (5.41) remains valid for $f(x) \in L_p$, $1 < p < 2$, but while it is true
that $D_\alpha[f]$ is a bounded transformation in $L_p$, it does not seem likely that the
bound should be equal to unity. It follows that (5.44) is likely to be false for
$p \ne 2$.

5.5. Let us define

(5.51)    $D_\alpha[f] = 0, \qquad \alpha < 0.$

The family of transformations $D_\alpha$ is then defined for $-\infty < \alpha < \infty$, $D_\alpha$ is
zero for $\alpha < 0$, tends to the identity as $\alpha \to \infty$, and is continuous for all values
of $\alpha$. These properties together with formula (5.11) express the fact that
$\{D_\alpha[f]\}$ is a family of projection operators forming the resolution of the identity
of a self-adjoint transformation H, in the terminology of J. von Neumann and
M. H. Stone. We shall show that
(5.52) = f(x) ~ f(s), 


where

(5.53)    $\tilde F(x) = -\dfrac{1}{\pi}\,\mathrm{P.V.} \int_{-\infty}^{\infty} F(u+x)\,\dfrac{du}{u},$

and P.V. denotes that the Cauchy principal value of the integral is to be taken
at u = 0. We recall that $\tilde F(x)$ exists for almost all x and is in $L_2$ if F(x) is in
$L_2$. In the following f(x) is an absolutely continuous function in $L_2$ whose
derivative, $f'(x)$, is also in $L_2$. We have


T f’] = — isgn aT [a; =| «| T[a; f], 
= iaT la; f] =|a| T[a; f]. 
These relations also prove the equivalence of the conjugate of the derivative 


and the derivative of the conjugate function. With the usual notation for the 
inner product, and assuming $g(x) \in L_2$,


(5.54) 


= (lal = f | «| 


on fea = g) 


by formula (5.41). It follows that 


This relation proves formula (5.52). 


YALE UNIVERSITY, 
New Haven, Conn. 




STEREOGRAPHIC PARAMETERS AND PSEUDO-MINIMAL
HYPERSURFACES*


BY
OTTO LAPORTE AND G. Y. RAINICH


INTRODUCTION 


This paper consists of two parts: in the first we discuss a general method
of representation of a hypersurface by means of a special system of
parameters, $x_1, \ldots, x_n$, which we shall call stereographic parameters. In §1 we show
how a hypersurface (n-dimensional in $E_{n+1}$) is expressed in terms of a single
function of the parameters which is closely related to a quantity introduced
by Painvin† whose value gives the distance of the tangent $E_n$ from the origin
of the coordinate system. Next the coefficients of the first and second
fundamental forms ($g_{ik}$ and $l_{ik}$) are obtained and shown to be related in the
following manner:


where $2\lambda = 1 + x_\sigma x_\sigma$. This relation suggests that it would be especially simple
to determine the surface by giving the l's. In fact (see §2) this problem, which
in general requires the integration of a complicated system of partial
differential equations, here reduces to quadratures (integral formulas (2.2)). The l
tensor must obey certain integrability conditions, the Codazzi equations,
which take a surprisingly simple form since they are free from any $g_{ik}$. In
the case of a two-dimensional minimal surface these equations are the
Cauchy-Riemann equations and the integral formulas reduce to the
Weierstrass representation of a minimal surface in terms of an analytic function.
It becomes clear that the ideas underlying the Weierstrass representation are
not limited to surfaces of special curvature properties; in particular we show
that the real reason of the Weierstrassian theorem about algebraic minimal
surfaces is a trivial one, for it lies in the choice of parameters; therefore it is
obvious how to generalize it to any hypersurface (§3). Since the stereographic
parameters depend upon the coordinate system in the $E_{n+1}$ our next task
(§4) is to investigate the resulting arbitrariness. We find that the situation is
governed by a certain $n(n+1)/2$-parameter group $\Omega$ which is a subgroup of
the conformal group in the space $(x_1, \ldots, x_n)$, and which is induced by the
rotation group in the $E_{n+1}$. Because of application in §10 the infinitesimal

* Presented to the Society, December 27, 1934, and April 19, 1935; received by the editors April 
17, 1935. 
† Journal de Mathématiques, (2), vol. 17 (1872), pp. 219-248.
operators arising from $\Omega$ are derived. Our special choice of stereographic
representation results in a restricted form of tensor analysis corresponding
to $\Omega$ (§5). The criterion that an expression or relation involving our
representation have geometrical meaning is invariance under $\Omega$. As the simplest
invariant relation appears

    $l_{\sigma\sigma} = 0,$

whose geometric meaning is that the sum of the n principal radii of curvature
vanishes (§6).

The second part of the paper is devoted to the study of the hypersurfaces
thus characterized, which we shall refer to as pseudo-minimal hypersurfaces.
In view of the connection between these surfaces and analytic functions
established by the Weierstrass formulas, the following developments may be
considered as a generalization of the theory of analytic functions to higher
dimensions. The requirement $l_{\sigma\sigma} = 0$ results in a differential equation for the
scalar function $\phi$:
dg 
— a2,— + = 0. 
OX,OX, OX, 


Any pseudo-minimal hypersurface can be obtained from a solution of this 
equation in the following way: 


N OX,OX, OX; N OX,O%>, 


For n = 2 this gives a representation of minimal surfaces different from that
of Weierstrass. The $l_{ik}$ are also given by very simple formulas in terms of $\phi$
(§7). We proceed to the integration of the above differential equation using
a method analogous to that of separation of variables in mathematical physics
by making the formal “Ansatz”


@ = , 
Here $H_l$ is a homogeneous polynomial of degree l; $f_l$ a function of r. It then
turns out that $H_l$ must satisfy the n-dimensional Laplace equation, and $f_l$ a
hypergeometric differential equation. In the next section (§8) the solutions
of this equation are studied in detail. Due to the appearance of integer and
half-integer arguments $\alpha$, $\beta$, $\gamma$, exceptional cases occur, and the solutions turn
out to be Jacobi polynomials in many instances.

Interesting and important are perhaps the cases of centrally symmetric
solutions which give rise to axially symmetric hypersurfaces (§9). For n = 2, 3,
and 5 these are actually obtained. We note here the unexpected result that
for n = 3 the hypersurface is generated by “rotating” two-dimensionally a
parabola around an axis.

The conclusion briefly touches upon the following topics: development of
a general solution; explicit connection with harmonic functions for even n;
obtaining particular solutions by applying infinitesimal operators to the
central symmetric solution; hypersurfaces of constant sum of radii.

The literature on hypersurfaces can be found in a book by Struik.* The
integral representation of §2 was indicated first by one of us.† The
representation in terms of the potential was found during the course of the
investigation. Although related potentials in three dimensions had been proposed
before (Painvin,‡ Minkowski,§ Blaschke||), their use in connection with
stereographic parameters, which results in the simple representation of §1,
seems not to occur in the literature.

It appears likewise that the special class of hypersurfaces here considered
never has been treated in the literature and that no examples of them are
known, although the invariant whose vanishing characterizes them can be
found expressed by means of other parameters in Forsyth.¶


Part I 
1. STEREOGRAPHIC PARAMETERS 


It is convenient to begin the discussion by considering a surface in ordi- 
nary space and then generalize, introducing index notation. 

Denote by X, Y, Z the coordinates of a point of a surface, by $\xi$, $\eta$, $\zeta$ the
components of a unit normal vector at that point, and by x, y the coordinates
of the stereographic projection of the point $\xi$, $\eta$, $\zeta$, which moves on a unit
sphere when X, Y, Z move on the surface. If the surface is not developable,
X, Y, Z will in general be functions of x, y which we shall call “stereographic
parameters.” From now on we shall consider X, Y, Z and $\xi$, $\eta$, $\zeta$ as functions
of x, y. We shall write

(1.1)    $r^2 = x^2 + y^2; \qquad 2\lambda = 1 + r^2.$

We have then

(1.2)    $\xi = \dfrac{x}{\lambda}; \qquad \eta = \dfrac{y}{\lambda}; \qquad \zeta = \dfrac{1-r^2}{1+r^2}.$


* D. J. Struik, Grundzüge der mehrdimensionalen Differentialgeometrie, Berlin, 1922.
Also A. R. Forsyth, Geometry of Four Dimensions, Cambridge, 1930.

† G. Y. Rainich, Comptes Rendus, vol. 180 (1925), p. 801.

‡ Loc. cit.

§ Mathematische Annalen, vol. 57 (1903), pp. 447-496.

|| Vorlesungen über Differentialgeometrie, 2d edition, vol. 1, 1924, pp. 85, 135.

¶ Loc. cit., p. 42 of vol. II.


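We have the following relations, which express the fact that $\xi$, $\eta$, $\zeta$ is
normal to the surface:

(1.3)    $\xi\,\dfrac{\partial X}{\partial x} + \eta\,\dfrac{\partial Y}{\partial x} + \zeta\,\dfrac{\partial Z}{\partial x} = 0, \qquad \xi\,\dfrac{\partial X}{\partial y} + \eta\,\dfrac{\partial Y}{\partial y} + \zeta\,\dfrac{\partial Z}{\partial y} = 0.$

We now introduce an auxiliary quantity $\phi$ which for reasons which will
be clear later we shall call the potential, viz.,

(1.4)    $Xx + Yy = \phi + (\lambda - 1)Z.$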

The quantity $p = X\xi + Y\eta + Z\zeta$ seems to have been introduced by
Painvin.* Minkowski's Stützfunktion† is also closely related to it.
Geometrically p means the distance from the origin of coordinates to the tangent
plane at the point considered. We note here for future reference the relations


(1.5) 


Returning to the potential $\phi$ we shall show that X, Y, Z may be expressed
in terms of it and its derivatives with respect to x and y. Denoting
differentiation with an index, we have from (1.4)

    $\phi_x = xX_x + yY_x + (1-\lambda)Z_x + X - Zx.$

The first three terms vanish because of (1.3) and we are left with

(1.6)    $\phi_x = X - Zx,$

and in the same way we obtain

(1.6')    $\phi_y = Y - Zy.$

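We substitute X and Y from these relations into (1.5) and obtain, taking
into account (1.1),

(1.7)    $Z = \dfrac{1}{\lambda}\,(\phi - x\phi_x - y\phi_y).$

Using this value in (1.6) and (1.6') we have

(1.7')    $X = \dfrac{1}{\lambda}\,[x\phi + (\lambda - x^2)\phi_x - xy\,\phi_y],$
          $Y = \dfrac{1}{\lambda}\,[y\phi - xy\,\phi_x + (\lambda - y^2)\phi_y].$


* Loc. cit.
† Loc. cit.
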
The formulas (1.7), (1.7') give a representation of any non-developable
surface. The formulas we have obtained generalize without difficulty to an
n-dimensional hypersurface in an (n+1)-dimensional euclidean space. We
use index notations, replacing n+1 as an index by 0. The indices a, b, c will
range from 0 to n, and all other indices i, j, k, etc., from 1 to n. Greek indices
$\alpha$, $\beta$ indicate summation from 0 to n, and other Greek letters summation from
1 to n. Although we use index notation and omit summation signs, we do not
use orthodox tensor analysis. Quantities such as $\lambda$, r are not scalars, in the
sense that they are affected by transformation of coordinates in $E_{n+1}$; but
they are invariant under rotations in n-space, and therefore as long as we keep
the same n-space, distinction between covariant and contravariant quantities
is irrelevant and we make all our indices subscripts. We shall discuss these
questions in detail in §5.
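The two-dimensional representation can be tried out on a sphere, where the answer is known in advance. The sketch below rests on assumptions that should be read as such: the potential is taken to be $\phi = \lambda p$ with p the distance from the origin to the tangent plane, the normal is taken as $\xi = x/\lambda$, $\eta = y/\lambda$, $\zeta = (1-\lambda)/\lambda$, and the surface point is computed from $Z = (\phi - x\phi_x - y\phi_y)/\lambda$, $X = \phi_x + xZ$, $Y = \phi_y + yZ$. The function names and the finite-difference step are illustrative.

```python
import math


def unit_normal(x, y):
    """Point of the unit sphere with stereographic parameters (x, y)."""
    lam = 0.5 * (1.0 + x * x + y * y)
    return x / lam, y / lam, (1.0 - lam) / lam


def sphere_potential(center, radius):
    """Potential phi = lam * p for a sphere, where p = radius + center . normal."""
    def phi(x, y):
        lam = 0.5 * (1.0 + x * x + y * y)
        xi, eta, zeta = unit_normal(x, y)
        p = radius + center[0] * xi + center[1] * eta + center[2] * zeta
        return lam * p
    return phi


def surface_point(phi, x, y, h=1e-5):
    """Recover (X, Y, Z) from phi using central differences for phi_x, phi_y."""
    lam = 0.5 * (1.0 + x * x + y * y)
    p0 = phi(x, y)
    px = (phi(x + h, y) - phi(x - h, y)) / (2.0 * h)
    py = (phi(x, y + h) - phi(x, y - h)) / (2.0 * h)
    Z = (p0 - x * px - y * py) / lam
    return px + x * Z, py + y * Z, Z


if __name__ == "__main__":
    center, radius = (1.0, 2.0, 3.0), 2.0
    phi = sphere_potential(center, radius)
    n = unit_normal(0.3, -0.2)
    expected = tuple(c + radius * k for c, k in zip(center, n))
    # The recovered point should coincide with center + radius * normal.
    print(surface_point(phi, 0.3, -0.2), expected)
```

For a sphere the point with a given unit normal is the center plus the radius times that normal, so agreement of the two printed triples is a consistency check on the sign conventions assumed above.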

We write now the extensions of the above formulas giving to the new 
formulas the same numbers with stars: 


C3... 2h = 1 + + =A; 


= ; 
OX « 
Ox; 
& = + 


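(1.7*)  $X_0 = \frac{1}{\lambda}\,(\phi - x_\rho\phi_\rho); \qquad X_i = \phi_i + \frac{x_i}{\lambda}\,(\phi - x_\rho\phi_\rho).$


In these formulas differentiation of $\phi$ with respect to $x_i$ is indicated by 
affixing the index $i$. 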
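
As a numerical illustration of how a hypersurface is recovered from its potential, the following minimal sketch evaluates $X_0 = (\phi - x_\rho\phi_\rho)/\lambda$ and $X_i = \phi_i + x_i X_0$ with $2\lambda = 1 + x_\rho x_\rho$. This form of the formulas is an assumption reconstructed from (1.6) and (1.7); the function names, the finite-difference gradient, and the test potential $\phi = R\lambda$ (a sphere of radius $R$ about the origin, whose Painvin function is the constant $R$) are illustrative choices and not part of the paper.

```python
import math

def grad(phi, x, h=1e-5):
    """Central-difference gradient of phi at the point x (a list of floats)."""
    g = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        g.append((phi(xp) - phi(xm)) / (2 * h))
    return g

def hypersurface_point(phi, x):
    """Return (X_0, [X_1, ..., X_n]) from the potential phi.

    Assumed form: X_0 = (phi - x.grad phi)/lambda, X_i = phi_i + x_i X_0,
    with 2*lambda = 1 + x.x.
    """
    lam = 0.5 * (1 + sum(t * t for t in x))
    g = grad(phi, x)
    X0 = (phi(x) - sum(xi * gi for xi, gi in zip(x, g))) / lam
    Xs = [gi + xi * X0 for xi, gi in zip(x, g)]
    return X0, Xs

R = 3.0  # hypothetical radius for the test surface

def phi_sphere(x):
    """Potential phi = R * lambda of a sphere of radius R centered at the origin."""
    return R * 0.5 * (1 + sum(t * t for t in x))

X0, Xs = hypersurface_point(phi_sphere, [0.3, -0.7, 1.1])
radius = math.sqrt(X0 * X0 + sum(t * t for t in Xs))
```

Under these assumptions every parameter point is mapped to a point at distance $R$ from the origin, as it should be for a sphere.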
Our next task is to compute in our parameters the coefficients of the funda- 
mental forms. We begin by calculating the derivatives 
— = + — — — ——| + — (6 — |. 
dx, OX 


It is convenient to introduce the notation 

(1.8)   $a_{ik} = \phi_{ik} + \frac{\delta_{ik}}{\lambda}\,(\phi - x_\rho\phi_\rho).$

We can write then 

(1.9)   $\frac{\partial X_i}{\partial x_k} = \left(\delta_{i\rho} - \frac{x_i x_\rho}{\lambda}\right) a_{\rho k}.$


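We next calculate 

(1.10)  $\frac{\partial X_0}{\partial x_k} = -\frac{1}{\lambda}\,x_\rho a_{\rho k}.$

Thus we have for the metric tensor 

$g_{ik} = \frac{\partial X_\alpha}{\partial x_i}\,\frac{\partial X_\alpha}{\partial x_k}$

or, using the definition of $\lambda$, 
(1.11)  $g_{ik} = a_{i\rho}\,a_{\rho k}.$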


For the calculation of the coefficients of the second differential form we 
use the formula (see, e.g., Forsyth, vol. 2, p. 348) 


OXq Ika 
a= 


OX jOX;, Ox; OX, 
In our case it gives 


do 


But differentiating (1.2*) we have 


Oi Vix: 


Together with (1.10) and (1.9) this gives 

(1.12)  $l_{ik} = -\frac{a_{ik}}{\lambda}.$

It should be noted that in our system of representation the $l$'s are simpler 
than the $g$'s. Combining the formulas (1.11) and (1.12) we obtain 

(1.13)  $g_{ik} = \lambda^2\,l_{i\rho}\,l_{\rho k}.$


2. GENERALIZED WEIERSTRASS FORMULAS 


In the classical theory of surfaces it has been known for a long time that 
a surface may be entirely determined, except for its position in space, by 
giving the E, F, G, L, M, N as functions of the parameters (O. Bonnet). With 
our choice of parameters the situation simplifies considerably, because if the 
$l_{ik}$ are given the $g_{ik}$ may be considered known as well (1.13), so that the giving 
of the $l_{ik}$ determines the hypersurface (it is not surprising that we need here 
fewer functions because our system of parameters does not involve arbitrary 
functions). In the classical theory the surface can be obtained from the funda- 
mental quantities by integrating a system of differential equations. Here we 
can reconstruct our hypersurface from the $l_{ik}$ (or the $a_{ik}$) by quadratures. 
The coordinates $X_a$ of the hypersurface may in fact be obtained from (1.9) 
and (1.10) by curvilinear integration. We thus obtain the formulas 


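(2.1)   $X_i = \int\left(\delta_{i\rho} - \frac{x_i x_\rho}{\lambda}\right) a_{\rho\sigma}\,dx_\sigma; \qquad X_0 = -\int \frac{x_\rho a_{\rho\sigma}}{\lambda}\,dx_\sigma.$


If we introduce $l_{ik}$ we have formulas which give everything directly in terms 
of the coefficients of the second differential form: 


(2.2)   $X_i = \int (x_i x_\rho - \lambda\,\delta_{i\rho})\,l_{\rho\sigma}\,dx_\sigma; \qquad X_0 = \int x_\rho\,l_{\rho\sigma}\,dx_\sigma.$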


As we shall see a little later, these formulas may be considered as generaliza- 
tions of the Weierstrass formulas for a minimal surface. 

Of course, the $a$'s or $l$'s in these formulas cannot be given arbitrarily; 
they must satisfy certain differential equations which may be obtained as in- 
tegrability conditions for (1.8), i.e., as conditions on the $a$'s that it should be 
possible to determine $\phi$ from (1.8). Introducing the notation 

(2.3)   $\frac{\phi - x_\rho\phi_\rho}{\lambda} = F,$

we have from (1.8) 

(2.4)   $a_{ik} = \phi_{ik} + \delta_{ik}F \quad\text{or}\quad a_{ik} - \delta_{ik}F = \phi_{ik}.$

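Since the right-hand sides are second derivatives, the $a_{ik}$ and $F$ must satisfy 
the relations 

(2.4')  $\frac{\partial a_{ij}}{\partial x_k} - \delta_{ij}\,\frac{\partial F}{\partial x_k} = \frac{\partial a_{ik}}{\partial x_j} - \delta_{ik}\,\frac{\partial F}{\partial x_j},$

which give, for $i$, $j$, $k$ all different, 

(2.5)   $\frac{\partial a_{ij}}{\partial x_k} = \frac{\partial a_{ik}}{\partial x_j}.$

Since $j$ and $k$ must be distinct (otherwise (2.4') are satisfied identically) 
we may assume now $i = j \neq k$ and have 

(2.6)   $\frac{\partial a_{jj}}{\partial x_k} - \frac{\partial F}{\partial x_k} = \frac{\partial a_{jk}}{\partial x_j},$

and if (2.5) and (2.6) hold it is easy to show that an $F$ exists which together 
with $\phi$ satisfies (2.4). 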

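So far we have not taken into account the relation (2.3). Writing it in the 
form 

$\lambda F + x_\rho\phi_\rho - \phi = 0$

and differentiating, we obtain 

$x_i F + \lambda\,\frac{\partial F}{\partial x_i} + x_\rho\phi_{\rho i} + \phi_i - \phi_i = 0,$

where the last two terms cancel. On the other hand, from (2.4) we have 

$x_\rho\phi_{\rho i} = x_\rho a_{\rho i} - x_i F.$

Together with the last relation this gives 

$\lambda\,\frac{\partial F}{\partial x_i} + x_\rho a_{\rho i} = 0.$

Differentiating (2.4), contracting in two different ways, and subtracting, we 
have 

$\frac{\partial a_{\rho\rho}}{\partial x_i} - \frac{\partial a_{\rho i}}{\partial x_\rho} = (n-1)\,\frac{\partial F}{\partial x_i}.$

Now we can eliminate $F$ and obtain 

$\lambda\left(\frac{\partial a_{\rho\rho}}{\partial x_i} - \frac{\partial a_{\rho i}}{\partial x_\rho}\right) + (n-1)\,x_\rho a_{\rho i} = 0$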
which may also be written as 


Ox; 


(2.7) 

We omit the proof that (2.5), (2.6) and (2.7) constitute also sufficient con- 
ditions for (1.8). This system can also be obtained from, and regarded as in- 
tegrability conditions for, the line integrals (2.1), and was so obtained origi- 
nally (contributions to this work by Mr. J. L. Coe and Dr. B. C. Getchell 


From this it follows that, for z, k, / all different, ’ 
in connection with a seminar in differential geometry in 1934 are acknowl- 
edged here), but it seems that an early introduction of the potential $\phi$ per- 
mits an easier approach. 

In terms of the $l_{ij} = -\lambda^{-1}a_{ij}$ we obtain in place of (2.5), (2.6), and (2.7) 

OX, Ox; ; OX; 

These relations constitute the Codazzi equations in stereographic parame- 
ters. In the classical theory the Codazzi equations involve the $l$'s and the $g$'s. 
Here, since the $g$'s are functions of the $l$'s, the latter alone appear in the equa- 
tions. Any hypersurface may be given by $l$'s which satisfy these equations. 
It might also be mentioned that the Gauss equations are satisfied identically. 

For $n = 2$ only the last set of (2.8) remains, because in the other equations 
all three indices must be distinct. We have thus 


OX, Ox; 


or, going back to ordinary notations, 


dL 
+ 


ar(L+N) 
+ 


Ox ay Ox Ox oy oy 


or 

(2.9)   $\lambda\left(\frac{\partial M}{\partial y} - \frac{\partial N}{\partial x}\right) = x(L + N), \qquad \lambda\left(\frac{\partial M}{\partial x} - \frac{\partial L}{\partial y}\right) = y(L + N).$

These equations were given in the paper referred to on page 156. 
For a minimal surface (as we shall see in §6) we have the relation 
$L + N = 0.$ 
Denoting $M$ by $u$ and $L = -N$ by $v$, (2.9) become in this case the Cauchy- 
Riemann equations, so that $u + iv = w$ is an analytic function of $z = x + iy$. An 
easy calculation shows that the formulas (2.2) become in this case the Weier- 
strass formulas. 
3. REPRESENTATION OF ALGEBRAIC HYPERSURFACES 


Weierstrass* proved that his representation of minimal surfaces possesses 
the property that every algebraic minimal surface is given by an algebraic 
analytic function, and conversely that to every algebraic analytic function 
there corresponds an algebraic minimal surface. Now our representation is a 


* See Bianchi, Lezioni di Geometria Differenziale, 3d edition, vol. I, 1922, p. 540, chapter 12, §204. 
generalization of Weierstrass's in two respects: it does not presuppose any- 
thing about the curvature as is the case for minimal surfaces, and it holds 
for any hypersurface of $n$ dimensions in an $(n+1)$-space. It is therefore natu- 
ral to inquire to what extent the Weierstrass theorem can be generalized. 
The result of this consideration is the following 


THEOREM. For a hypersurface to be an algebraic hypersurface it is necessary 
and sufficient that the potential $\phi$ in 


$X_i = \phi_i + \frac{x_i}{\lambda}\,(\phi - x_\rho\phi_\rho), \qquad X_0 = \frac{1}{\lambda}\,(\phi - x_\rho\phi_\rho)$


be an algebraic function of $x_1, \ldots, x_n$. 


That this is a sufficient condition is immediately seen from the above 
formulas, for differentiation cannot introduce a transcendentality and thus $X_i$ 
and $X_0$ will be algebraic functions of the $x_i$ and of each other if $\phi$ is an alge- 
braic function of $x_i$. The important part of the theorem is that conversely we 
obtain all algebraic hypersurfaces from algebraic functions $\phi$. We therefore 
proceed to find the function $\phi$ belonging to any algebraic surface and to show 
that it is also algebraic. 

If the surface $X_0 = X_0(X_1, \ldots, X_n)$ is algebraic then certainly the direc- 
tion cosines of the normal $\xi_a$ are algebraic functions of the $X_0$ and $X_i$. But since 
for stereographic parameters 

$x_i = \frac{\xi_i}{1 + \xi_0}$

the relation between the $x_i$ and the $\xi_a$, and hence between the $x_i$ and the $X_0$ and $X_i$, is also algebraic. We can 
therefore express $X_0$ and $X_i$ as algebraic functions of $x_i$. Introducing these into 

$\phi = x_\rho X_\rho + (1 - \lambda)X_0$

we obtain $\phi$ as an algebraic function of $x_i$. Thus the theorem is proved. 


4. GROUP PROPERTIES 


The formulas we have obtained are based on the use of a rectangular 
cartesian coordinate system in $E_{n+1}$. We wish to see now how they are affected 
by a transformation of these coordinates. Consider first a translation; for 
$n = 2$ we have 


$X' = X + h, \qquad Y' = Y + k, \qquad Z' = Z + l.$


Substituting in formula (1.4), since $x$, $y$, $\mu$ are not affected we obtain for the 
new $\phi$ 


$\phi' = \phi + hx + ky + l\mu.$


For a general $n$ this becomes 
(4.1)   $\phi' = \phi + h_\rho x_\rho + h_0\mu.$

Now consider a change of the axes with the origin preserved. $\xi$, $\eta$, $\zeta$ and 
X, Y, Z will be changed, but the scalar product of the vectors of which they 
are components will not be affected. The change in $\phi$ will therefore come from 
$\lambda$. Let the transformation formulas be 


(4.2)   $\xi_a' = s_{a\beta}\,\xi_\beta$
with the orthogonality condition 
$s_{a\beta}\,s_{c\beta} = \delta_{ac}.$

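This comprises both rotations (determinant +1) and rotation-reflexions (de- 
terminant −1). Denoting by $x_i'$ the quantities which in the new system corre- 
spond to $x_i$ we have 


(4.3)   $x_i' = \frac{\xi_i'}{1 + \xi_0'} = \frac{s_{i\beta}\,\xi_\beta}{1 + s_{0\beta}\,\xi_\beta} = \frac{N_i}{D},$


where 
(4.4)   $N_i = s_{i\rho}x_\rho + s_{i0}\mu, \qquad D = \lambda + s_{0\rho}x_\rho + s_{00}\mu.$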


These formulas include inversions (transformations by reciprocal radii) for 


= soo = 


The substitutions (4.3) constitute a group which we will call $\Omega$. It should 
be noted that in spite of the linear appearance of these formulas they are 
essentially quadratic. For $n = 2$ using complex numbers $x_1 + ix_2$ we obtain a 
subgroup of the group of fractional linear transformations where the essen- 
tially quadratic character of the transformations is masked by the use of 
complex division. 
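
The way a rotation of the axes acts on the stereographic parameters can be checked numerically. The sketch below takes $n = 2$, assumes the relations $\xi_i = x_i/\lambda$, $\xi_0 = (1-\lambda)/\lambda$, $x_i = \xi_i/(1+\xi_0)$ and $\mu = 1 - \lambda$ (all reconstructed from the statement $\lambda = 1/(1+\xi_0)$, not quoted from the paper), rotates the unit normal in the plane of the $X_0$ and $X_1$ axes, and compares the result with the quotients $N_i/D$ and $\lambda/D$. The helper names and the sample values are hypothetical.

```python
import math

def normal_from_params(x):
    """Unit normal (xi_0, xi_1, ..., xi_n) from stereographic parameters x."""
    lam = 0.5 * (1 + sum(t * t for t in x))
    return [(1 - lam) / lam] + [t / lam for t in x]

def params_from_normal(xi):
    """Stereographic parameters x_i = xi_i / (1 + xi_0)."""
    return [t / (1 + xi[0]) for t in xi[1:]]

def rotate(s, xi):
    """Apply the orthogonal matrix s (index 0 is the X_0 axis) to the normal xi."""
    return [sum(s[a][b] * xi[b] for b in range(len(xi))) for a in range(len(xi))]

t = 0.5  # sample rotation angle in the (X_0, X_1) plane
s = [[math.cos(t), -math.sin(t), 0.0],
     [math.sin(t),  math.cos(t), 0.0],
     [0.0,          0.0,         1.0]]

x = [0.4, -0.2]                       # sample parameter point
lam = 0.5 * (1 + sum(v * v for v in x))
mu = 1 - lam                          # assumed meaning of mu

x_new = params_from_normal(rotate(s, normal_from_params(x)))
lam_new = 0.5 * (1 + sum(v * v for v in x_new))

# Numerators and denominator in the assumed form of (4.4).
D = lam + sum(s[0][r + 1] * x[r] for r in range(2)) + s[0][0] * mu
N = [sum(s[i + 1][r + 1] * x[r] for r in range(2)) + s[i + 1][0] * mu
     for i in range(2)]
```

Although $N_i$ and $D$ look linear in the $x_\rho$, both contain $\lambda$ and $\mu$, which are quadratic in the parameters; this is the sense in which the substitutions are essentially quadratic.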

We shall have to use in what follows an expression for 


$\lambda' = \tfrac{1}{2}\,(1 + x_\rho' x_\rho')$


which it is easy to calculate from the above formulas. Since we already have 


$\lambda = 1/(1 + \xi_0)$


we have now 


$\lambda' = \frac{1}{1 + \xi_0'} = \frac{1}{1 + s_{0\rho}\,\dfrac{x_\rho}{\lambda} + s_{00}\,\dfrac{\mu}{\lambda}}$

or 
(4.5)   $\lambda' = \frac{\lambda}{D}.$


Of course, we have also 


(4.5')  $\lambda = \frac{\lambda'}{D'},$


where $D'$ is the denominator of the inverse transformation, or 
(4.5'') $D' = \lambda' + s_{\rho 0}\,x_\rho' + s_{00}\,\mu'.$


Now that we know how $\lambda$ is affected by our transformations, we can find 
the law of transformation for $\phi$. If we denote the potential in the new coordi- 
nate system by $\phi'(x_1', \ldots, x_n') = \phi'(x')$ we have, on account of (1.4), 


¢'(x’) = ~ 


where we have to consider \ as a function of the x’, i.e., 


2 [ 
1 + x,(«’)x,(x’) 


Using again (1.4) and (4.5’) this may be written as* 

$\phi'(x') = \frac{1 + x_\rho' x_\rho'}{1 + x_\rho x_\rho}\,\phi[x(x')] = D'\,\phi[x(x')],$


or introducing a symbol T corresponding to a transformation of coordinates 
(4.2), 


Sip pe + Si 
(4.6) To = (Sop%p + Soom + = | 


SopXp + Soo + A 


The transformation formulas for the quantities $l_{ik}$ (or $a_{ik}$) and $g_{ik}$ are 
easily obtained, but they will not be used in what follows. 


* It is interesting to compare this situation with that of the harmonic functions (let us say, to 
fix the ideas, in three dimensions). It is well known that the substitution $x/r^2, \ldots$ for $x, \ldots$ does 
not carry a harmonic into a harmonic function, but that 


$\frac{1}{r}\,H\!\left(\frac{x}{r^2}, \frac{y}{r^2}, \frac{z}{r^2}\right)$


is harmonic. We get a clearer insight into the situation if we remark that the class of functions 
$\lambda^{1/2}H(x, y, z)$ is invariant under the reciprocal radii transformations as well as under $\Omega$ of which the 
former are special cases. For a general $n$ the exponent of $\lambda$ is $n/2 - 1$. 
Although our situation is slightly different from that usually considered 
in the Lie theory, in that $\phi$ is transformed not only by substitution but also 
by multiplication, the idea of infinitesimal transformations and the rules for 
obtaining them are essentially the same. We consider a one-parameter group 
$T(h)$ (with canonical parameter $h$) contained in $\Omega$. This makes the parameters 
of the group, in our case the $s_{ab}$, functions of $h$, so that we can write 


T(h) = T[si(h)]. 


The corresponding infinitesimal transformation is given by 


OT¢ ) (=) (= ) 
4.7 = 
(4.7) ( Oh J Oh \ War 


In applying this general rule to (4.6) we must note that, for $h = 0$, $T$ becomes 
the identical transformation, so that $(s_{ab})_{h=0} = \delta_{ab}$ and that the $(\partial s_{ab}/\partial h)_0$ are 
the components of an antisymmetric tensor. Therefore (4.7) may be rewritten 
as 


(4.8) (=) (=) ( - ) = (=) Mar, 
oh 0 acb oh 0 OSba / 0 acb oh 0 


where the first factors are independent quantities, $n(n+1)/2$ in number, so 
that the $M_{ab}$ constitute the infinitesimal transformations. Computing first 
the case where none of the indices is zero, we have 


OSik OS 0 OX; OX, 


These are, as was to be expected, the familiar infinitesimal rotations in 
the $x$-space; they correspond to rotations which leave the $X_0$ axis invariant 
in the $X$-space. More interesting are the 


(4. 10) M (uber + a — xo = Od. 


T 


The first part of these operators involving the derivatives corresponds to 
rotations of $(n+1)$-space in the $n$ coordinate planes containing the $X_0$ axis. 
In the vicinity of the origin they approximate displacements along the 
$x_1, \ldots, x_n$ axes.* It is therefore natural to compare the group $\Omega$ generated 
by the $Q$'s and $M$'s with the group of rigid motions in $n$-space. 

Since $\Omega$ is isomorphic (in the narrow sense) to the group of rotations in 


* The term $-x_i\phi$ in (4.10) is due to the fact that $\phi$ differs according to (1.5) by a factor $\lambda$ from 
the true scalar $p$. 


1936] PSEUDO-MINIMAL HYPERSURFACES 167 


(n+1)-space, the commutation relations must be the same as those of the 
latter group. They may be written as 


(Mav, Mec) = Mec. 


All other $M$'s commute. In our notation, giving preference to the $X_0$ axis, 
this becomes 
(Mii, Mix) = Mix, 
(4.11) (M;;, Qi) = Qi, 
0;) = — Mij. 


These formulas will be used in §10. 


5. TENSOR-ANALYTICAL ASPECTS 


The quantities $x_i$ may be considered as parameters of the hypersurface or 
as Gaussian coordinates on it; the quantities $l_{ij}$ are the components of a tensor 
and so are the quantities $g_{ij}$. In dealing with these tensors we do not employ 
general coordinates, but only the special systems obtained by stereographic 
projection. When we change our coordinates we always pass from one special 
coordinate system to another. The general rules of tensor calculus apply, of 
course, in all cases. However, due to the special character of the coordinates 
used it did not seem advantageous to employ the entire apparatus of tensor 
analysis. 
The most striking peculiarity is that for a tensor $t_{ik}$ the equation 


(5.1)   $t_{\rho\rho} = 0$


is invariant under our transformations. In order to explain the situation we 
have to make some remarks which are almost trivial. 

When we speak of tensors we have in mind quantities which obey known 
laws of transformations; here we are interested only in transformations corre- 
sponding to the coordinate changes just mentioned. As to formation of in- 
variants we must remark that we can use any tensor as a “metric tensor,” 
define raising and lowering of indices with respect to it, and apply this opera- 
tion in order to obtain invariants of other tensors by means of contraction. 
A better way of expressing this is to say that covariant tensors have no in- 
variants, but that there exist simultaneous invariants of two tensors, and 
contracting $l_{ij}$ using the metric tensor $g_{ij}$ is simply a method of building these 
simultaneous invariants. Ordinarily a definite tensor $g_{ij}$ is singled out by the 
nature of the problem, and it is used exclusively for the formation of in- 
variants with other tensors. However, if we apply the same transformation 
of coordinates to two spaces, the metric tensor of each may serve to form 
invariants with the tensors arising in the study of the other; and in practice we 
often would use the simplest of these. 

In our case the simplest seems to be the metric tensor of the unit sphere 
which we denote by $G_{ij}$. It may be calculated by use of the formulas 


Ika 


7 OX; Ox; 


ij 


and (1.2*), from which we obtain 


Ox; 
so that† 


(5.2)   $G_{ij} = \frac{\delta_{ij}}{\lambda^2}.$


We find without difficulty 

(5.3)   $G^{ik} = \lambda^2\,\delta_{ik},$

and we may use this to form invariants according to what was said above of 
any tensor referring to any hypersurface. We shall avoid raising indices, and 
the only place where we use superscripts will be in metric tensors where upper 
indices simply denote the inverse of a matrix; i.e., $m^{ik}$ denotes a matrix such 
that 


$m^{i\rho}\,m_{\rho k} = \delta_{ik}.$
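Given any tensor $t_{ij}$ and any other tensor $m_{ij}$, we know that 

$m^{\rho\sigma}\,t_{\rho\sigma}$

is an invariant. In particular 

$G^{\rho\sigma}\,t_{\rho\sigma}$

is an invariant, and using the above expression for $G^{ik}$ we find that 

(5.4)   $\lambda^2\,t_{\rho\rho}$

is an invariant. 


† This formula (5.2) permits us to use the tensor character of $G$ to derive rapidly the conformal 
character of our transformations. The transformation formula 

$G_{ij}' = G_{\rho\sigma}\,\frac{\partial x_\rho}{\partial x_i'}\,\frac{\partial x_\sigma}{\partial x_j'}$

becomes, using (5.2) and an analogous formula in the other coordinate system, 

$\frac{\delta_{ij}}{\lambda'^2} = \frac{1}{\lambda^2}\,\frac{\partial x_\rho}{\partial x_i'}\,\frac{\partial x_\rho}{\partial x_j'}$

or, using (4.5), 

$\frac{\partial x_\rho}{\partial x_i'}\,\frac{\partial x_\rho}{\partial x_j'} = D^2\,\delta_{ij},$

which shows that the matrix 

$\frac{\partial x_i}{\partial x_k'}$

differs from an orthogonal matrix by a numerical factor $D$; this is another way of saying that the 
transformation we are considering is conformal. 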

Furthermore, if two tensors $t_{ij}$ and $s_{pq}$ are given, of course $t_{ij}s_{pq}$ is a tensor 
of rank four, and from it we may obtain a tensor of rank two by contracting 
with respect to any third tensor $m_{jp}$, so that 


$m^{\rho\sigma}\,t_{i\rho}\,s_{\sigma k}$

is a tensor. Using $G$ for $m$ we have that 

$\lambda^2\,t_{i\rho}\,s_{\rho k}$


is a tensor. As an example we may mention formula (1.13) which gives the 
metric tensor $g$ of a hypersurface in terms of the $l$'s of that hypersurface. It 
becomes clear now why $t_{\rho\rho} = 0$ is an invariant equation. It is because $\lambda^2 t_{\rho\rho}$ 
is an invariant. We have 


$\lambda'^2\,t_{\rho\rho}' = \lambda^2\,t_{\rho\rho}$


or, according to (4.5), 


$t_{\rho\rho}' = D^2\,t_{\rho\rho}.$


The result of summing with respect to two lower indices is therefore in our 
case a relative invariant. Equating a relative invariant to zero we obtain, of 
course, an invariant equation. The situation we have here is intermediate be- 
tween that of general tensor analysis and the special case when we consider 
only rectangular cartesian coordinates. In the latter case we may, of course, 
use $\delta_{ij}$ as a metric tensor, and we may form invariants (they will be invariants 
only under orthogonal transformations) by summing lower indices, and, in 
general, we do not have to bother about the level on which the indices are. 

Of course it is permissible in considering a hypersurface to use in the 
orthodox way the $g$'s of that hypersurface for raising indices and contracting. 
For that purpose we need the $g^{ik}$. In our notation $l^{ik}$ is simply a matrix such 
that 


$l^{i\rho}\,l_{\rho k} = \delta_{ik}.$


The elements of that matrix are the $(n-1)$-row minors of the matrix $l_{ik}$ 
divided by the determinant $l$. It is easy to see because of (1.12) that 


$g^{ik} = \lambda^{-2}\,l^{i\rho}\,l^{\rho k}.$


Now we can contract any tensor; applying this in particular to $l_{ik}$ we obtain 


$g^{\rho\sigma}\,l_{\rho\sigma} = \lambda^{-2}\,l^{\rho\rho},$


which means the sum of the diagonal minors of $l_{ik}$ divided by the determinant 
of the $l$'s and by $\lambda^2$. This is an invariant which will appear again in the next 
section. On the other hand we may obtain an invariant by contracting $l^{ik}$ 
by means of $g_{ik}$. We obtain 


$g_{\rho\sigma}\,l^{\rho\sigma} = \lambda^2\,l_{\rho\rho},$


which is the invariant (5.4) obtained above. 

We come now to differential invariants. Here, as before, it seems impor- 
tant to emphasize the point that all such invariants must be considered as 
simultaneous invariants of the given tensor and of the fundamental tensor, 
and that we may use any tensor as the fundamental tensor. It is natural again 
to use $G_{ik}$, the metric tensor of the unit sphere. We first calculate the cor- 
responding three-index symbols 


Ox; Ox, Ox, 


= = 


We can now form differential invariants and covariants. Taking the Painvin 
function p for instance, which is a scalar, we can form its second covariant 


derivatives 
ap 
OX, Ox; OX, 


ap ap ) Op 


Ox 


—x — 

If we express this in terms of $\phi = \lambda p$ and take into account (1.8) and (1.12) 
we obtain 

ik 


Pik = 


Lic = + Gixp. 



or 




This formula is of a more general character than its derivation by means of 
stereographic parameters would seem to indicate. 

The tensor character of the $l$'s is put here in evidence by expressing it 
tensor-analytically in terms of a scalar, using the fundamental form of the 
unit sphere. 

We thus see that theoretically the Painvin function is the simplest. But 
in practice when considering special surfaces it is necessary to use explicit 
expressions rather than symbolic formulas, and then the potential $\phi$ in 
stereographic parameters furnishes the simpler expression for $l_{ik}$. 

We may remark in conclusion that the last formula seems to be related to 
Minkowski's developments concerning the Stützfunktion; see Blaschke, 
Vorlesungen über Differentialgeometrie, 1, §78. 


6. CURVATURE INVARIANTS 


The theory of hypersurfaces may be regarded as the theory of differential 
invariants of ¢ under the group of transformations which we have been con- 
sidering in the last sections. Instead of developing this theory we may, of 
course, use the classical theory of surfaces, and its generalization, the tensor 
analysis. 

From the general theory of hypersurfaces we know (see, e.g., Forsyth, 
vol. 2, p. 39) that the curvature properties of a hypersurface at a point may 
be expressed in terms of the roots of the equation 

(6.1)   $|\,l_{ik} - k\,g_{ik}\,| = 0,$

where the vertical bars indicate the determinant of the matrix. These roots, 
which are obviously invariants, are called the principal curvatures of the 
hypersurface, and their reciprocals the principal radii of curvature. Very 
often instead of these irrational invariants their symmetric functions 
($I_1 = k_1 + k_2 + \cdots + k_n$, $\ldots$, $I_n = k_1 k_2 \cdots k_n$) are considered. In the case 
$n = 2$, for instance, the product of these roots is the total curvature, and 
the sum the mean curvature. All curvature properties at a point may be ex- 
pressed in terms of these invariants, which are rational functions in the $g$'s 
and $l$'s. The denominator of $I_p$ is the determinant of the $g$'s, and the numera- 
tor is the sum of determinants obtained from this determinant by replacing 
in all possible ways $p$ rows of the $g$'s by the corresponding $l$'s. The vanishing 
of these various invariants characterizes important classes of hypersurfaces 
(for $n = 2$, $I_1 = 0$ gives minimal and $I_2 = 0$ developable surfaces). In $n$ dimen- 
sions there will be $n$ such types; those corresponding to $I_1 = 0$ have been con- 
sidered in the literature, and are known as minimal hypersurfaces. The prob- 
lem of determining such types of hypersurfaces, in other words of integrating 
the differential equations 

(6.2)   $I_p = 0$

is a difficult one, because of their non-linear character. The only case ($n > 2$) 
known to us where the hypersurface has been determined is that of the axially 
symmetric minimal hypersurface in the case $n = 3$. (The general case $n > 3$ 
does not present additional difficulties.) They have been determined by A. R. 
Forsyth (loc. cit., vol. 2, p. 328), and independently by M. Born, in whose 
theory they serve to describe the field of an electron.* 

Using stereographic parameters, equation (6.1) may be simplified con- 
siderably if we use for the $g$'s their expressions (1.11) in terms of the $l$'s. The 
equation becomes 


(6.3)   $|\,\delta_{ik} - k\lambda^2\,l_{ik}\,| = 0,$

and the expressions for the invariants $I_p$ become much simpler. In particular 
$I_{n-1}$ turns out to be 

(6.4)   $I_{n-1} = \frac{l_{\rho\rho}}{\lambda^{2n-2}\,|\,l_{ik}\,|}.$

Its vanishing is equivalent to the differential equation 

(6.5)   $l_{\rho\rho} = 0.$


The corresponding hypersurfaces have the property that the sum of the prin- 
cipal radii of curvature is zero. For $n = 2$ they happen to coincide with mini- 
mal surfaces. But for $n > 2$ they do not seem to be derivable from a variation 
principle as are the hypersurfaces corresponding to $I_1 = 0$. These latter may 
properly be called minimal. We introduce for hypersurfaces which satisfy 
(6.5) the term pseudo-minimal. Part II is devoted to the investigation of 
these. 
Part II 


7. DIFFERENTIAL EQUATION OF PSEUDO-MINIMAL HYPERSURFACES 
AND ITS REDUCTION 


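We have seen that the hypersurfaces in question are characterized by the 
equation $l_{\rho\rho} = 0$, or, according to (1.12), 


$a_{\rho\rho} = 0.$
Using the relations (1.8) which define the $a$'s, we find 


(7.1)   $\lambda\,\phi_{\rho\rho} - n\,x_\rho\phi_\rho + n\phi = 0,$
or 


$\lambda\,\Delta\phi - n\,x_\rho\,\frac{\partial\phi}{\partial x_\rho} + n\phi = 0.$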
OX, 


* Proceedings of the Royal Society, (A), vol. 143 (1934), p. 410, and vol. 144 (1934), p. 425. 
This equation is fundamental in the study of our hypersurfaces. 
If $\phi$ is a solution of this equation, formulas (1.7) may be written as 


$X_0 = -\frac{1}{n}\,\Delta\phi; \qquad X_i = \phi_i - \frac{1}{n}\,x_i\,\Delta\phi.$


Using this in (1.8) we find for $a_{ik}$ 


$a_{ik} = \frac{\partial^2\phi}{\partial x_i\,\partial x_k} - \frac{1}{n}\,\delta_{ik}\,\Delta\phi.$

Therefore every pseudo-minimal hypersurface may be given by the in- 
tegral formulas (2.1) in which the $a$'s have the above values. We call $\phi$ the 
potential because it appears as an auxiliary quantity in terms of whose deriva- 
tives the quasi-tensor $a_{ik}$ (and the tensor $l_{ik}$) may be expressed. 
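
Whether a given potential satisfies the fundamental equation can be tested numerically. The sketch below evaluates the left-hand side $\lambda\,\Delta\phi - n\,x_\rho\,\partial\phi/\partial x_\rho + n\phi$ by central differences; this form of (7.1), the function names, the step size and the sample potentials are assumptions made for illustration only. The sample potentials are $1 - r^2$ and a linear function (both of which give residual zero), the product $(x^2 - y^2)(1 + r^2/3)$ in two dimensions, and the sphere potential $\phi = \lambda$, which is not pseudo-minimal.

```python
def pm_residual(phi, x, h=1e-3):
    """Finite-difference value of lam*Laplace(phi) - n*x.grad(phi) + n*phi at x."""
    n = len(x)
    lam = 0.5 * (1 + sum(t * t for t in x))
    laplace = 0.0
    x_dot_grad = 0.0
    for i in range(n):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        laplace += (phi(xp) - 2 * phi(x) + phi(xm)) / (h * h)
        x_dot_grad += x[i] * (phi(xp) - phi(xm)) / (2 * h)
    return lam * laplace - n * x_dot_grad + n * phi(x)

def phi_degree0(x):
    """1 - r^2: a constant (degree 0) times the radial factor 1 - r^2."""
    return 1 - sum(t * t for t in x)

def phi_linear(x):
    """A linear function: a harmonic of degree 1 with radial factor 1."""
    return 2 * x[0] - x[1] + 0.5 * x[2]

def phi_degree2(x):
    """(x^2 - y^2)(1 + r^2/3) in two dimensions."""
    r2 = x[0] ** 2 + x[1] ** 2
    return (x[0] ** 2 - x[1] ** 2) * (1 + r2 / 3)

def phi_unit_sphere(x):
    """phi = lambda, the potential of the unit sphere (not pseudo-minimal)."""
    return 0.5 * (1 + sum(t * t for t in x))

point3 = [0.3, -0.4, 0.2]
point2 = [0.5, 0.25]
```

For the unit sphere the residual equals $n$ at every point, so the sphere is excluded, in agreement with the sum of its principal radii being different from zero.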

The determining partial differential equation (7.1) is of such form that 
it is possible to reduce its complete integration to that of an ordinary differ- 
ential equation and of the $n$-dimensional Laplace equation. This can be 
brought about in several ways. We may, for example, introduce $n$-dimen- 
sional polar coordinates $r, \theta_1, \theta_2, \ldots, \theta_{n-1}$. Then the familiar process of 
separation of variables allows us immediately to split off the radial differ- 
ential equation, while a partial differential equation in the angles $\theta_1, \ldots, \theta_{n-1}$ 
remains which is readily identified as the differential equation of the surface 
harmonics on an $(n-1)$-dimensional hypersphere. 

The method adopted here is not essentially but only formally different 
from the one just described. Since polar coordinates of $n$ dimensions are 
clumsy, it is preferable to establish connection with the solid harmonics rather 
than the surface ones. 

We therefore make the “Ansatz” 


(7.3)   $\phi_l = H_l\,f_l,$


where $H_l$ is a homogeneous function of degree $l > 0$ and where $f_l$ is a function 
of the radius-vector or (which is equivalent) of $\lambda$. Substituting (7.3) in (7.1) 
and using repeatedly Euler's theorem $x_\rho\,\partial H_l/\partial x_\rho = lH_l$, we obtain 


(7.4)   $\lambda f\,\Delta H + PH = 0,$

where 

(7.5)   $P = \lambda(2\lambda - 1)f'' + (n - n\lambda + 2l\lambda)f' - n(l - 1)f.$

Here primes denote differentiation with respect to $\lambda$. In order to separate 
variables we write (7.4) in the form 


(7.6)   $\frac{r^2\,\Delta H}{H} = -\frac{r^2 P}{\lambda f}.$

Now the right-hand side being a function of $r$ alone has a constant value on 
a hypersphere, while the left-hand side being a homogeneous function of 
$x_1, \ldots, x_n$ of degree zero has constant values along a radius vector. There- 
fore the common value of the two sides must be a constant, say $k$. We have 


(7.7)   $r^2\,\Delta H = kH,$
(7.8)   $r^2 P = -k\lambda f.$

We shall now show that we may always and without loss of generality 
put $k$ equal to zero. For we can always, without changing the form of the 
“Ansatz” (7.3), multiply $f_l$ by a power of $r$ and $H_l$ by the reciprocal power, and we may assume that 
$H_l$ is regular at $r = 0$ but can no longer be divided by a power of $r$ without 
introducing a singularity at that point. If this is the case, then equation (7.7) 
implies $k = 0$ and we have 
(7.9)   $\Delta H = 0,$

(7.10)  $P = \lambda(2\lambda - 1)f'' + (n - n\lambda + 2l\lambda)f' - n(l - 1)f = 0.$
Homogeneous solutions of (7.9), which is the $n$-dimensional Laplace equa- 
tion, are well known; they are the hyperspherical harmonics. 
There exist for every degree $l$ and dimension number $n$ 


$\frac{(n + 2l - 2)\,(n + l - 3)!}{(n - 2)!\;l!}$


independent solid spherical harmonics. In particular, in three dimensions 
there are $2l + 1$ of them which are connected in the familiar way with the 
Legendre polynomials. For higher dimensions the reader may look up the 
corresponding Gegenbauer functions in the Encyclopaedia article by Appell.* 
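
The count of independent solid harmonics can be checked with a few lines of integer arithmetic. The sketch below uses the standard formula $(n + 2l - 2)(n + l - 3)!/((n-2)!\,l!)$ and compares it with the difference between the number of homogeneous polynomials of degree $l$ and of degree $l - 2$ in $n$ variables; the function names are hypothetical, and the case $l = 0$ is handled separately so that $n = 2$ is also covered.

```python
from math import comb, factorial

def count_harmonics(l, n):
    """Number of independent solid harmonics of degree l in n >= 2 variables."""
    if l == 0:
        return 1
    return (n + 2 * l - 2) * factorial(n + l - 3) // (factorial(n - 2) * factorial(l))

def count_by_polynomials(l, n):
    """Same count as (homogeneous polynomials of degree l) minus (those of degree l - 2)."""
    total = comb(n + l - 1, l)
    lower = comb(n + l - 3, l - 2) if l >= 2 else 0
    return total - lower

# Counts for n = 3 and degrees 0..5; these should be the values 2l + 1.
three_dim = [count_harmonics(l, 3) for l in range(6)]
```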


8. DISCUSSION OF SOLUTION OF THE RADIAL DIFFERENTIAL EQUATION 


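The differential equation (7.10) is immediately recognized to be of hyper- 
geometric type. Introducing 


(8.1)   $\zeta = 1 - 2\lambda = -r^2$
as independent variable, we obtain the equation in standard form: 


(8.2)   $\zeta(1 - \zeta)\,\frac{d^2 f}{d\zeta^2} + \left[\,l + \frac{n}{2} - \left(l - \frac{n}{2}\right)\zeta\,\right]\frac{df}{d\zeta} + \frac{n}{2}\,(l - 1)\,f = 0,$


with the Gaussian elements 

(8.3)   $\alpha = l - 1, \qquad \beta = -\frac{n}{2}, \qquad \gamma = l + \frac{n}{2}.$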


* P. Appell and A. Lambert, tome II, vol. 5, fascicule 2 (II, 28a). 
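The well known theory of the hypergeometric function* furnishes us with 
integrals in the vicinity of the three singular points 0, 1, and $\infty$; it suffices 
here to consider those around $\zeta = 0$. We have then as a first regular integral 


(8.4)   $f_l^{(1)} = F\!\left(l - 1,\; -\frac{n}{2},\; l + \frac{n}{2};\; \zeta\right).$


Here $F(\alpha, \beta, \gamma; \zeta)$ is the well known Gaussian series: 


(8.5)   $F(\alpha, \beta, \gamma; \zeta) = 1 + \frac{\alpha\beta}{1\cdot\gamma}\,\zeta + \frac{\alpha(\alpha+1)\,\beta(\beta+1)}{1\cdot 2\cdot\gamma(\gamma+1)}\,\zeta^2 + \cdots.$


The second integral may be written in either of the following forms: 


(8.6a)  $f_l^{(2)} = \zeta^{\,1-l-n/2}\,F\!\left(1 - l - n,\; -\frac{n}{2},\; 2 - l - \frac{n}{2};\; \zeta\right),$
(8.6b)  $f_l^{(2)} = \zeta^{\,1-l-n/2}\,(1 - \zeta)^{n+1}\,F\!\left(2 - l,\; 1 + \frac{n}{2},\; 2 - l - \frac{n}{2};\; \zeta\right).$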


Now it is evident that when the first or second argument $\alpha$, $\beta$ of a Gaus- 
sian hypergeometric series $F(\alpha, \beta, \gamma; \zeta)$ is a negative integer, the series de- 
generates into a polynomial, the Jacobi polynomial; and that, when the third 
argument $\gamma$ is a negative integer, the series $F(\alpha, \beta, \gamma; \zeta)$ loses its meaning. 
Thus since different kinds of integers may appear according to whether the 
number of dimensions $n$ is even or odd, we shall separate the discussion of 
these two cases. 
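
The termination of the Gaussian series can be followed in exact rational arithmetic. The sketch below builds the coefficients of $F(\alpha, \beta, \gamma; \zeta)$ from the usual term ratio and checks a terminating series against the standard hypergeometric equation $\zeta(1-\zeta)f'' + [\gamma - (\alpha+\beta+1)\zeta]f' - \alpha\beta f = 0$. The elements $\alpha = l - 1$, $\beta = -n/2$, $\gamma = l + n/2$ are an assumption, obtained by substituting $\zeta = 1 - 2\lambda$ in the radial equation (7.10); the function names are hypothetical.

```python
from fractions import Fraction

def radial_elements(l, n):
    """Assumed Gaussian elements (alpha, beta, gamma) of the radial equation."""
    return Fraction(l - 1), Fraction(-n, 2), Fraction(2 * l + n, 2)

def gauss_coeffs(a, b, c, terms):
    """First `terms` coefficients of the Gaussian series F(a, b, c; z)."""
    coeffs = [Fraction(1)]
    for k in range(terms - 1):
        # Ratio of consecutive terms: (a + k)(b + k) / ((k + 1)(c + k)).
        coeffs.append(coeffs[-1] * (a + k) * (b + k) / ((k + 1) * (c + k)))
    return coeffs

def ode_residual(coeffs, a, b, c):
    """Coefficients of z(1-z)f'' + (c - (a+b+1)z)f' - a*b*f for the polynomial f."""
    res = [Fraction(0)] * (len(coeffs) + 1)
    for k, ck in enumerate(coeffs):
        if k >= 2:
            res[k - 1] += k * (k - 1) * ck   # z * f'' contribution
            res[k] -= k * (k - 1) * ck       # -z^2 * f'' contribution
        if k >= 1:
            res[k - 1] += c * k * ck         # c * f' contribution
            res[k] -= (a + b + 1) * k * ck   # -(a+b+1) z f' contribution
        res[k] -= a * b * ck                 # -a*b*f contribution
    return res

# Even n = 4 (m = 2), l = 2: the second argument is -m, so the series terminates.
a, b, c = radial_elements(2, 4)
f_even = gauss_coeffs(a, b, c, 5)

# Odd n = 3, l = 0: the first argument is -1, so the series terminates after two terms.
a0, b0, c0 = radial_elements(0, 3)
f_l0 = gauss_coeffs(a0, b0, c0, 4)
```

For a series that does not terminate, the truncated coefficient list leaves a nonzero residual in the highest power, so the check is meaningful only in the polynomial cases.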

(1) $n$ is an odd integer. Inspecting (8.4) and (8.6) we see that, due to the 
appearance of $n/2$ in the third argument, we are safe from any breakdown 
of solutions. As for appearance of polynomials, we see that $f^{(1)}$ will be a poly- 
nomial if $l - 1$ is negative or zero, i.e., for $l = 0$ and $l = 1$; in which cases we 
have 


(8.7)   $f_0^{(1)} = 1 + \zeta,$
(8.8)   $f_1^{(1)} = 1.$


For $l \ge 2$ the series no longer breaks off. We mention, however, that even then 
the series may be summed and written in a closed form, involving polynomials 
and logarithms. 


* See, e.g., Whittaker and Watson, Modern Analysis, 3d edition, 1920, chapter 14; Courant- 
Hilbert, Methoden der Mathematischen Physik, vol. 1, chapter II, paragraph 10, p. 74. 
Looking at the other solution we see that it always is of the following 
form: negative power of $\zeta$ times a Jacobi polynomial. It therefore behaves 
like $1/\zeta^{\,l+n/2-1}$, i.e., like $1/r^{2l+n-2}$ in the vicinity of the origin. Closed expres- 
sions are obtained easily since the $N$th Jacobi polynomial can be written as 
$N$th derivative of a simple generating function with respect to $\zeta$. We have, 
using (8.6a), 


8.9a (2) = @ r™l-2 
( ) fi ad ( ); 
or using (8.6b), 


d 
8.95 ac 
dr 


(2) $n$ is an even integer. Putting $n = 2m$ we see that now the first integral 
always (i.e., not only for $l = 0$ and 1) degenerates into a Jacobi polynomial 
by virtue of its second argument being $-m$. For $l = 0$ and $l = 1$, (8.7) and (8.8) 
still hold, but for $l \ge 2$ the following expression may be used: 


d m 
(8.10) = ( ). 

We pass on to the discussion of $f^{(2)}$ for even $n$. Here we have to expect a 
breakdown of the solutions (8.6a) and (8.6b), for the third argument $2 - l - m$ 
is always a negative integer. It may however happen, and these are the “ex- 
ceptions of the second order,” especially treated by Klein,* that a negative 
integer as first or second argument causes the series to break off before it 
has a chance to acquire infinite terms on account of the third argument being 
a negative integer. In this case, therefore, one of the first two arguments must 
be a larger negative integer than the third; or, since both numbers are nega- 
tive, it is better to say that the absolute magnitude of one of the first two 
arguments must be smaller than that of the third. Inspecting (8.6a) we see 
that $l + 2m - 1$ will always be larger than $l + m - 2$ but that (from the second 
argument) 


(8.11a) $m \le l + m - 2$ as soon as $l \ge 2$. 


Comparing the first and third arguments in (8.6b) we see, similarly, that 
(8.11b) $0 \le l - 2 \le l + m - 2$, as soon as $l \ge 2$. 


The exceptions of second order are therefore the rule here; and we obtain 
from (8.6a) 


* Felix Klein, Vorlesungen über die Hypergeometrische Funktion, p. 35. 
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m l—m+1 
(8.12a) fi) 1 (=) ( ) 
aN 


and from (8.6b) 


‘ 1-2 
dr 


(8.12b) is of course the same as (8.9b), but now it would be no longer correct 
to apply it for $l < 2$. 
Solutions $f_0^{(2)}$ and $f_1^{(2)}$ are easily obtained, since we know in each case the 
other solution, viz. (8.7) and (8.8); we have, using a well known formula, 
(8.13) = (1 + | 
(1+ 
1— 2m 
(1 — §) 4 


(2) = 
(8.14) fy 


Expanding and integrating by terms, we find expressions containing loga- 
rithms and power series beginning with $1/\zeta^{\,m-1}$ and $1/\zeta^{\,m}$ respectively. Of 
course closed expressions can also be obtained easily by direct integration. 

In the following table the above results are summarized. The two columns 
of the table give the solutions for $n$ even or $n$ odd; the rows for various 
values of $l$: 0, 1, and from 2 on. Each row is subdivided and gives in its upper 
part $f^{(1)}$ and in its lower part $f^{(2)}$. 


n odd 


1—r? 


d\* 
G) ( (1—r?) log +br 


1 1 


dx 


n n p2ltim—2 


(a) Ca) (5) Cae) 


i 
One sees that always one solution is algebraic; but a curious complementary
character is exhibited in the occurrence of the case where both solutions
are algebraic functions, i.e., where the general integral of (8.2) is algebraic, a
case studied by H. A. Schwarz. For odd n this case occurs for l=0 and l=1;
for n even only for l ≥ 2. Needless to say, in these cases the conditions given
by Schwarz are fulfilled. Incidentally the Schwarzian conditions show that
for l ≥ 2 and n odd, the first solution is indeed a transcendental function of ζ.

The appearance of algebraic solutions has been discussed above somewhat 
more in detail in view of the theorem of §3. 

Until now we have considered only positive values of l. If we wished to
consider also negative values of l we should find that the following relation
holds:

as can be proved either by direct computation or by applying the general
theory of the Riemann P-function. This is in agreement with the fact that
one can obtain from every homogeneous harmonic function of positive degree
l, by multiplying it by r^{-2l-n+2}, one of negative degree -l-n+2, and
therefore if we admit negative degrees in (7.3) every potential may be split up in
two ways:


@ = Aifi = 


furnishing thus two solutions of radial differential equations. One of them is 
regular and the other singular, in keeping with what is indicated by the super- 
scripts in formula (8.15). 
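
The inversion step used above (multiplying a homogeneous harmonic function of degree l by r^{-2l-n+2}) can be checked numerically. The following is only a sketch under stated assumptions: the test function H = xy in three variables, the sample point, the step size and the helper names `laplacian` and `inverted` are illustrative choices and do not come from the paper.

```python
import math

def laplacian(f, point, h=1e-3):
    """Central-difference approximation of the Laplacian of f at `point`."""
    f0 = f(point)
    total = 0.0
    for i in range(len(point)):
        plus = list(point)
        minus = list(point)
        plus[i] += h
        minus[i] -= h
        total += (f(tuple(plus)) - 2.0 * f0 + f(tuple(minus))) / (h * h)
    return total

def inverted(H, l, n):
    """Return x -> r^(-2l-n+2) * H(x), for H homogeneous of degree l in n variables."""
    def g(x):
        r = math.sqrt(sum(c * c for c in x))
        return r ** (-2 * l - n + 2) * H(x)
    return g

# Hypothetical example: H(x, y, z) = x*y is harmonic and homogeneous of degree 2.
H = lambda x: x[0] * x[1]
G = inverted(H, l=2, n=3)      # x*y / r^5, expected degree -l-n+2 = -3

point = (0.7, -0.4, 1.1)
lap_H = laplacian(H, point)    # should vanish up to rounding
lap_G = laplacian(G, point)    # should vanish up to discretization error
```

Both Laplacians come out negligibly small at the sample point, and doubling the argument of G divides its value by 8, as expected for degree -3.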

We see thus that we have obtained a large class of potentials which may
be presented as products of a harmonic function and a hypergeometric
function of the radius vector. In case of a singularity at the origin we have a
choice between two such representations, in one of which the hypergeometric
function and in the other the harmonic function has that singularity.


9. PSEUDO-MINIMAL HYPERSURFACES OF REVOLUTION 


As in the case of the Laplace equation the spherically symmetric solutions
of the differential equation (7.1) are of special interest. Considering in
formulas (1.7*) φ as a function of r alone we obtain a hypersurface of revolution
with the X_0 axis as axis of symmetry. In other words, the intersection of the
hypersurface with a hyperplane X_0 = constant will be an (n-1)-dimensional
hypersphere. We see from the theorem of §3 and from the table of the last


} H. A. Schwarz, Gesammelte Abhandlungen, vol. 2, pp. 211 ff. 
section, that for odd n the surface will be algebraic, for n even transcendental.
We shall discuss a few typical cases.
n=3. Using (8.6a) we have, suppressing a multiplicative constant, 


1 
=—--6r+r'. 
r 


Substituting this into (1.7*), we obtain 


X; 
Xo 


Introducing the n-dimensional radius vector R^2 = X_i X_i, we see that the
“meridian curve” is the parabola which, adjusting a scale factor, may be
written
Xe = 2R 1. 
1 1 

r r 

and the surface is 
X 


rT 
As meridian curve we obtain 
54(X2 — 2R+ 1)? = 
As an example of even n we treat the case of n=2. Equation (8.13) gives
= (1 — r*) log (— 7’) — 4(1 — r*) + 4, 


or, since we are free to add a multiple of the other (trivial) solution 1-r^2,
we may write, adjusting conveniently a multiplicative constant,


@ = 3(1 — logr + 3(1 4+ 7’). 


This gives as surface 


; 1 1 
Xo = — logr, 
r 


which is the familiar catenoid. 


† The solution φ = 1-r^2 gives only a point.


if 

1\? 
= ——([r+— 
r 
1 é 
=4(r——)}. 
10. CONCLUDING REMARKS


The material presented may be developed in several different directions. 
In this conclusion we shall indicate briefly some of these developments. 

(α) Alternative, Maxwellian, method of obtaining solutions of (7.1). Both
the Laplace equation and our equation (7.1) involve a type of operator
considered by Casimir† in connection with the general semi-simple group. In
the general case, the operator is of the form


K = g^{ik} D_i D_k,    g_{ik} = c_{il}^m c_{km}^l,


where the D_i are the infinitesimal operators of the group and c_{ik}^l the structure
constants of Lie’s third fundamental theorem; and it is readily seen that any
D commutes with K. It follows from this that from any solution φ of a
differential equation


(K + aI)φ = 0


(where I is the identical operator and a a constant), other solutions may be
obtained by applying any one of, or a succession of, the operators D_i to φ.
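
The step from commutation to new solutions can be written out in one line. The sketch below assumes only what the text states, namely that each operator D commutes with K (and trivially with the identical operator I); the symbols are those of the paragraph above.

```latex
% Assumption: DK = KD, and \varphi satisfies (K + aI)\varphi = 0.
\[
  (K + aI)(D\varphi) \;=\; D\,(K + aI)\varphi \;=\; D\,0 \;=\; 0 ,
\]
% so D\varphi is again a solution. Repeating the argument, the same holds
% for any product D_{i_1} D_{i_2} \cdots D_{i_k}\varphi.
```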

In the case of the Laplace equation, the group in question is the group of
translations and a=0. The operators D_i are here ∂/∂x_i, and we are led to
Maxwell’s method of obtaining solutions of the Laplace equation, which in
case n=3 and using φ = 1/r happens to give essentially all solutions.

Turning to our differential equation (7.1), a straightforward calculation 
shows that it can be written in terms of the infinitesimal operators (4.9) and 
(4.10) of the group Q in the form 


( Mie + = 0. 

k ick 
This leads to an alternative, Maxwellian, method of constructing solutions. We
may, by elementary methods, obtain a central symmetric solution φ_0. From
it an infinity of other solutions may be derived by acting upon it with Q or M
or a combination of them. We thus obtain linear combinations of the
particular solutions of §§7 and 8. The relationship between the former point of view
and the present is best exhibited by the following formula:


= adi-r + 


where a and b depend only upon l. This formula is readily derived, using a
recursion formula.


† Proceedings of the Royal Academy of Amsterdam, vol. 34 (1931), p. 144. See also Rotation
of a Rigid Body in Quantum Mechanics, by the same author, Leiden dissertation, 1931, chapter 4.
(β) Series expansion. The potentials φ_l introduced in §§7 and 8 may be
used for expanding a general potential into a series. Assume that φ is a
pseudo-minimal potential regular within a sphere of radius 1. On the surface
of a sphere of radius less than 1, φ may be expanded into a series of surface
harmonics


¢= aif, 


and the series Σ a_l H_l, where the H_l are the corresponding solid harmonics,
converges in and on that sphere. Consider now the series


l 
ai H le 
If we can prove that the quantities {,/f; are bounded in the sphere, we shall 
know that this series converges uniformly, and, since each term satisfies our 
differential equation, we shall have (taking into account the analytic char- 
acter of solutions of an equation of elliptic type) a pseudo-minimal potential. 
Since on the surface of the sphere f; reduces to f;, the series reduces to that 
for @, so that we have a potential which must coincide with our original po- 
tential, because it coincides with it on the surface of the sphere. 

The proof of the statement that the quantities f,/f, are bounded is based 
on the remark that, according to (8.4), for sufficiently large values of / the 
coefficients of the expansion for f‘ can be made as close as desired to the 
coefficients of the binomial series for (1 —¢)*/?. 

(γ) Explicit formulas connecting harmonic functions with pseudo-minimal
potentials. The preceding section contains a method by which we may
obtain from a harmonic function, by developing it into a series and
multiplying the terms by appropriate functions of r, a pseudo-minimal potential. In
the case of an even n we have succeeded in expressing this connection in an
explicit form; for n=2 and n=4, for instance, we have respectively the
following formulas:


= + AK,x,, 
and 
= 3uH + + 
For every harmonic function 3¢ these formulas give a pseudo-minimal 
potential, as may be verified by a straightforward calculation. 


(δ) Hypersurfaces for which the sum of the radii of curvature is constant.
These may be included in the present theory without any effort. This sum
is equal to the invariant I_{n-1}, for which we found the formula (6.4). Equating
I_{n-1} to a constant c, using (1.12) and (1.8) we obtain


Adon — — +c = 0, 


which shows that χ = φ + c/n is a solution of the homogeneous equation (7.1).
We see thus that in this fashion, by adding a constant to a pseudo-minimal
potential, we may obtain a representation of a hypersurface I_{n-1} = const.
NORMAL DIVISION ALGEBRAS OF DEGREE p^e
OVER F OF CHARACTERISTIC p*


BY 
A. ADRIAN ALBERT 


1. Introduction. In a recent paper† I proved that a normal division
algebra D of degree p, a prime, over a field F of characteristic not p, is cyclic if
and only if D contains a sub-field F(y), y^p = γ in F. This result evidently leads
to the conjecture that any normal division algebra D of degree n over F is cyclic
over F if and only if D contains a maximal sub-field, F(y), y^n = γ in F.

The conjectured criterion given above would be of fundamental importance
for the theory of the structure of normal division algebras. Without
loss of generality we may assume that n = p^e, p a prime, and the theory then
gives rise to two distinct cases according as F does or does not have
characteristic p. We shall consider the former case here and give a brief simple proof
of the criterion.

2. Cyclic fields of degree p^e. Let F be a field of characteristic p ≠ 0. An
equation
(1)    λ^p = λ + a    (a in F)
is called a normed equation. If x is a root of (1) so are x+1, x+2, ..., x+p-1,
and we have the Artin-Schreier lemmas:‡


Lemma 1. A normed equation is either cyclic or has its roots in F. Every 
cyclic field of degree p over F may be generated by a root of a normed equation. 
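
The remark that the roots of a normed equation are x, x+1, ..., x+p-1 amounts to the polynomial identity (X+k)^p - (X+k) = X^p - X in characteristic p. A minimal sketch that checks this identity coefficient by coefficient modulo p; the function names are illustrative and not taken from the paper.

```python
from math import comb

def shifted_normed_poly(p, k):
    """Coefficients mod p of (X+k)^p - (X+k), lowest degree first."""
    # Binomial expansion: coefficient of X^j in (X+k)^p is C(p, j) * k^(p-j).
    coeffs = [comb(p, j) * pow(k, p - j, p) % p for j in range(p + 1)]
    coeffs[1] = (coeffs[1] - 1) % p   # subtract X
    coeffs[0] = (coeffs[0] - k) % p   # subtract k
    return coeffs

def normed_poly(p):
    """Coefficients mod p of X^p - X, lowest degree first."""
    coeffs = [0] * (p + 1)
    coeffs[p] = 1
    coeffs[1] = (-1) % p
    return coeffs

# Check the identity for a few small primes and every shift k = 0, ..., p-1.
identity_holds = all(
    shifted_normed_poly(p, k) == normed_poly(p)
    for p in (2, 3, 5, 7, 11)
    for k in range(p)
)
```

Since the two polynomials agree, x^p - x = a implies (x+k)^p - (x+k) = a for every k in the prime field.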


Lemma 2. Let Z=F(x) be cyclic of degree p over F,
(2)    x^p = x + a    (a in F).
Then a quantity x_0 of Z satisfies a normed equation if and only if
x_0 = kx + b    (k = 0, 1, ..., p-1; b in F).
Let Z_e be cyclic of degree p^e over F so that
(3) 


where Z_i is cyclic of degree p^i over F, cyclic of degree p over Z_{i-1}. I have
proved‡ that Z_i = F(x_i),


* Presented to the Society, April 20, 1935; received by the editors March 26, 1935. 

† These Transactions, vol. 36 (1934), pp. 885-892.

‡ For the properties of this section see my paper in the Bulletin of the American Mathematical
Society, vol. 40 (1934), pp. 625-631.
(4)    x_i^p = x_i + a_i    (a_i in Z_{i-1}),
and that Z_e has a generating automorphism S given by
(5)    x_i^S = x_i + β_i,
where* 
(6) = Tzyr(Bi) = (— 
and 
(7)    a_i^S - a_i = β_i^p - β_i.

Every quantity of Z_e has the form


(8) 


t;=0 
Write α = a_{p-1,...,p-1} so that
a = αβ_{e+1} + a_0.
I have proved that there exists a quantity c in Z_e such that a_0 = c^S - c. Then
T_{Z_e|F}(a_0) = 0, T_{Z_e|F}(a) = (-1)^e α, a = αβ_{e+1} + c^S - c.
If T_{Z_e|F}(a) = 0 then α = 0 and a = c^S - c, while conversely a = c^S - c implies
that T_{Z_e|F}(a) = 0. When also a = d^S - d then d - c = γ has the property γ^S = γ,
γ is in F. We thus have


Lemma 3. Let Z_e be cyclic of degree p^e over F and with generating
automorphism S. Then

(9)    T_{Z_e|F}(a) = 0    (a in Z_e),
if and only if
(10)    a = c^S - c    (c in Z_e).

Moreover (10) has a unique solution c apart from an additive constant in F.
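
The smallest instance of Lemma 3 can be enumerated by brute force. The sketch below is an illustration under assumptions of our own choosing: it takes p=2, e=1, F = GF(2) and Z = GF(4), with S the squaring automorphism; none of these names or choices appear in the paper.

```python
# GF(4) = {a + b*w : a, b in GF(2)} with w^2 = w + 1; elements stored as (a, b).
def add(u, v):
    return ((u[0] + v[0]) % 2, (u[1] + v[1]) % 2)

def mul(u, v):
    a, b = u
    c, d = v
    # (a + b w)(c + d w) = ac + (ad + bc) w + bd w^2, and w^2 = w + 1
    return ((a * c + b * d) % 2, (a * d + b * c + b * d) % 2)

def S(u):
    """Squaring map u -> u^2, the generating automorphism of GF(4) over GF(2)."""
    return mul(u, u)

def trace(u):
    """Trace from GF(4) to GF(2): u + u^S."""
    return add(u, S(u))

ELEMENTS = [(a, b) for a in range(2) for b in range(2)]
PRIME_FIELD = {(0, 0), (1, 0)}

trace_zero = {u for u in ELEMENTS if trace(u) == (0, 0)}
# c^S - c; subtraction equals addition in characteristic 2
image = {add(S(c), c) for c in ELEMENTS}
solutions = {a: [c for c in ELEMENTS if add(S(c), c) == a] for a in trace_zero}
```

In this example the quantities of trace zero are exactly those of the form c^S - c, and each admits two solutions c differing by an element of the prime field, in line with the uniqueness statement of the lemma.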


Lemma 4. The field Z_e of Lemma 3 contains a quantity β_{e+1} such that
(11)    T_{Z_e|F}(β_{e+1}) = (-1)^e
and every a of Z_e has the form
(12)    a = T_{Z_e|F}(a)(-1)^e β_{e+1} + c^S - c    (c in Z_e).

In particular I have proved that


(13)    T_{Z_e|F}(β_{e+1}^p - β_{e+1}) = 0


* We write T_{Z|F}(a) for the trace of the quantity a in Z over F.
so that β_{e+1}^p - β_{e+1} = a_{e+1}^S - a_{e+1}. Then I have shown that the field F(x_{e+1})
determined by


(14)    x_{e+1}^p = x_{e+1} + a_{e+1}


is cyclic of degree p^{e+1} over F with Z_e as sub-field and generating
automorphism given by x_{e+1}^S = x_{e+1} + β_{e+1} and that of Z_e.

3. Cyclic algebras of degree p over F. Consider n-rowed square matrices
A with elements in an infinite field F of characteristic p.

Two n-rowed square matrices A, B with elements in F are similar in F
if and only if they have the same invariant factors. The minimum equation
of A is the equation obtained by setting its invariant factor φ(λ) of highest
degree in λ equal to zero. When φ(λ) has degree n it coincides with the
characteristic polynomial of A and every B such that φ(B) = 0 is similar to A.

In particular let n=p and y^p = γ in F, y be in a total matric algebra M
of degree p over F. By a proper choice of the representation of M by the
algebra of all p-rowed square matrices with elements in F we may take


(15)    y = | 0  1  0  ...  0 |
            | 0  0  1  ...  0 |
            | .  .  .  ...  . |
            | 0  0  0  ...  1 |
            | γ  0  0  ...  0 |.


Since F is an infinite field there exists a quantity ξ ≠ 0, 1, ..., p-1 in F
and thus


(16)    x = diag(ξ, ξ+1, ..., ξ+p-1)


is non-singular. A trivial computation gives
(17)    x^p = x + a,    yx = (x + 1)y,


where a = ξ^p - ξ is in F.
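
The “trivial computation” behind (17) can be reproduced on a small example. The sketch below makes several assumptions that should be read as ours, not the author’s: y is taken as the matrix with ones on the superdiagonal and γ in the lower left corner, x as the diagonal matrix diag(ξ, ξ+1, ..., ξ+p-1), and the field is the finite field GF(9) with p=3 (the paper works over an infinite field; a finite one is used here only so that the arithmetic is exact and short).

```python
P = 3  # the characteristic p; all matrices are p-rowed

# GF(9) = {a + b*w : a, b in GF(3)} with w^2 = -1; elements stored as (a, b).
ZERO, ONE = (0, 0), (1, 0)

def add(u, v):
    return ((u[0] + v[0]) % P, (u[1] + v[1]) % P)

def mul(u, v):
    return ((u[0] * v[0] - u[1] * v[1]) % P, (u[0] * v[1] + u[1] * v[0]) % P)

def mat_mul(A, B):
    n = len(A)
    C = [[ZERO] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                C[i][j] = add(C[i][j], mul(A[i][k], B[k][j]))
    return C

def mat_pow(A, e):
    R = A
    for _ in range(e - 1):
        R = mat_mul(R, A)
    return R

def diagonal(entries):
    n = len(entries)
    return [[entries[i] if i == j else ZERO for j in range(n)] for i in range(n)]

gamma = (1, 1)   # hypothetical choice of gamma in the field
xi = (0, 1)      # xi = w, which is not one of 0, 1, ..., p-1

# y: ones on the superdiagonal, gamma in the lower left corner (assumed form)
y = [[ZERO] * P for _ in range(P)]
for i in range(P - 1):
    y[i][i + 1] = ONE
y[P - 1][0] = gamma

# x = diag(xi, xi+1, ..., xi+p-1) (assumed form)
x_diag = [add(xi, (i, 0)) for i in range(P)]
x = diagonal(x_diag)
x_plus_1 = diagonal([add(d, ONE) for d in x_diag])

# a = xi^p - xi
xi_p = xi
for _ in range(P - 1):
    xi_p = mul(xi_p, xi)
a = add(xi_p, ((-xi[0]) % P, (-xi[1]) % P))
x_plus_a = diagonal([add(d, a) for d in x_diag])
```

With these choices yx and (x+1)y agree, x^p equals x + a with a = ξ^p - ξ, and y^p is the scalar matrix γ, which is what (17) and the relation y^p = γ require.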

We now let D be a normal division algebra of degree p over F and let
y^p = γ in F for y in D but not in F. There exists a separable field K = F(η)
of degree p over F such that D_K is a total matric algebra over F. Then D is
equivalent to an algebra of p-rowed square matrices with elements in K and
the quantities 1, η, ..., η^{p-1} are linearly independent in D, the quantities
1, y, ..., y^{p-1} are linearly independent in K. We have thus proved that there
exists a quantity x in D_K such that yx = (x+1)y.
We write


(18)    x = x_0 + x_1 η + ... + x_{p-1} η^{p-1}    (x_i in D)
and the equation yx = (x+1)y is equivalent to

(19)    [y x_0 - (x_0 + 1)y] + (y x_1 - x_1 y)η + ... + (y x_{p-1} - x_{p-1} y)η^{p-1} = 0.
Thus y x_0 = (x_0 + 1)y where y ≠ 0, x_0 ≠ 0 are in D. The minimum equation of x_0
has degree p over F and x_0 + 1 in F(x_0) as a root. The field Z = F(x_0) is cyclic
over F and D is a cyclic algebra. The converse is well known and we have


THEOREM 1. Let D be a normal division* algebra of degree p over F of
characteristic p. Then D is cyclic if and only if D contains an inseparable sub-field
F(y), y^p = γ in F.

4. Cyclic fields over K=F(y). Let K=F(y) be inseparable of degree p
over F, y^p = γ in F, and let Z=Z_e be cyclic of degree p^e over K. Then (3)-(7)
are satisfied with β_{i+1}, a_{i+1} in Z_i = K(x_i). Write a_1 = Σ α_j y^j with α_j in F so
that a_1^p = Σ α_j^p γ^j = a_{10} is in F. Then x_{10} = x_1^p has the property
x_{10}^p - x_{10} = (x_1^p - x_1)^p = a_1^p = a_{10} is in F. But in fact x_{10} = x_1 + a_1 generates K(x_1).
Hence

Z_1 = Z_{10} × K
where Z_{10} is cyclic of degree p over F. We may in fact prove


THEOREM 2. Let Z be cyclic of degree p^e over K=F(y), y^p = γ in F. Then Z
is the direct product
Z = Z_0 × K,    Z_0 = F(x),    Z = K(x),
where Z_0 is cyclic of degree p^e over F.

For let the above theorem be true for the sub-field Z;_, of Z.. Then 
and of (6) isin Z;_1,0. We also have Z;=Z;_1(x;), x” =x:+a; 
and may write aio =) in Z;_1,0. The quantity xi9=x;+<; 
=x,” generates Z; over K and =Xiot@io, Xj =Xio where aio and are 
in Zj1,0. Thus and Z;=Zio XK. The induction is complete 
and Theorem 2 is proved. 

5. Cyclic algebras of degree p^e over F. We shall now prove

THEOREM 3. Let D be a normal division algebra of degree n = p^e over F of
characteristic p and let F(y) be a maximal sub-field of D, y^n = γ in F. Then D
is a cyclic algebra


* If F is a finite field there exist no normal division algebras of degree greater than unity over F.


(Z, S, γ),
where yx = x^S y for every x of Z.
For assume that the theorem is true for algebras of degree p^i < p^e and let
D have degree p^e over F and contain y such that y^n = γ in F, F(y) is a maximal
sub-field of D. Define


m = p^{e-1},    y_m = y^m,


so that the algebra B of all quantities of D commutative with y_m is a normal
division algebra of degree m over K = F(y_m). By the hypothesis of our
induction there exists a cyclic field Z_0 of degree m over K in B such that yz = z^S y.
Theorem 2 states that Z_0 = Z_{e-1} × K where Z_{e-1} is cyclic of degree m over F.
Any change in the generating automorphism of Z_0 is accomplished by
replacing y by y^r, r prime to p, so we may assume without loss of generality
that yz = z^S y for every z of Z_{e-1} where S generates the cyclic automorphism
group of Z_{e-1}. Write Z_{e-1} = F(x_{e-1}).

The algebra G of all quantities of D commutative with x_{e-1} is a normal
division algebra of degree p over Z_{e-1} and contains y_m. By the hypothesis of
our induction there exists an % in G such that y,%1=(*n+1)ym. Then 
Xo has the properties 


Xo? = Xo + ao, = [xo + (— ym = 


with ao in Z,_:. The quantity y transforms x» in G into 


yxoy = Xo inG, = Xoy + YmXoy = (Xoy + 5)¥m 
where 
6 = (—1)**= Tz, = VYm- 
Write xoy =>” with b; in and have 


p—l 
bi (x0 + 5) 5 Vm + >> b:(xo 6) = in Z-1(X0)- 


i=0 i=0 


Thus b,(%9+6) =b; is in for i=1, - - - , m, bo(%o+6) =bo+1. By Lemma 2 
we have b)>=kx +8 with & an integer and in Z,_;. Then (%0+6)k+8 
=kxot+B+6, kR=1, to =x0+8, 


= Xo + P(ym) 
where* P(ym) is in 


* Note the analogy between this result and the theorem that yor=2xyo if and only if yo=Py 
with P in F(x). 
The field Z_0 = Z_{e-1}(y_m) = Z_{e-1} × F(y_m) is cyclic of degree p^{e-1} over
K = F(y_m). Thus y x_0 y^{-1} = x_0 + P, y^2 x_0 y^{-2} = x_0 + P + P^S, and finally


y™xoy—™ = Xo + TzyKx(P) = = Xo + 5, Tz x(P) = 


and, since Z)>=Z,_,XK, 
Tz x(B- P) = 0. 
We apply Lemma 3 and obtain a quantity g in Z) such that gS5—g=8,—P. 
Define and obtain 2), =2%.0+420, 
= (Xoy + = (w+ P+e+8. — P)y = (x. + Be)y. 
Then satisfies (x5)?=«5+a,, and K(xeo) is cyclic of degree p* 
over K. By Theorem 2 the field K(x.) =K XZ. where Z, is cyclic of degree 


p* over F and has the same generating automorphism asZ,». In fact Z. =F (x.), 
=x +B,. We have proved Theorem 3. 
SOME ARITHMETIC MEANS CONNECTED
WITH FOURIER SERIES* 


BY 
L. S. BOSANQUET 


1. Introduction. It is known that if a series Σ a_n is bounded (C, γ), γ ≥ -1,
then it is either summable (C, γ+δ), for every δ>0, or not summable (A).†
Further a necessary and sufficient condition for it to be bounded (C, γ) is
that it should be bounded (A) and the sequence n a_n bounded (C, γ+1).‡
Conditions have been obtained§ under which a Fourier series Σ A_n cos nt or
an allied series Σ B_n cos nt should be bounded (C) for t=0 and, in the case of
the allied series, conditions under which the sequence nB_n should be bounded
(C). But the problem of the boundedness (C) of the sequence nA_n has
apparently not been considered directly.||

We suppose that f(t) is integrable L, and periodic with period 2π. We write


φ(t) = ½{f(x+t) + f(x-t)},
ψ(t) = ½{f(x+t) - f(x-t)},


and suppose that the Fourier series of φ(t) and ψ(t) are respectively

Σ_{n=0}^∞ A_n cos nt    and    Σ_{n=1}^∞ B_n sin nt.


Then the Fourier series and allied series of f(t) at the point t=x are
respectively

Σ_{n=0}^∞ A_n    and    Σ_{n=1}^∞ B_n.¶


We write, for t>0,


* Presented to the Society, September 13, 1935; received by the editors March 5, 1935. 

† Proved by Littlewood, 17, for δ=1 and γ = integer, and completed by Andersen, 1. See also
Hardy and Littlewood, 13a, Kogbetliantz, 15, p. 38. Numbers in heavy type refer to the list of
references at the end of this paper.

‡ Hardy and Littlewood, 13a, p. 283. Kogbetliantz, 15, p. 38.

§ The first systematic results of this type were given by Hardy and Littlewood, 11, 12.

|| Sufficient conditions for the existence of the Cesaro limits of nA_n and nB_n have been given by
Young, 24, 25, and the limits of the arithmetic and logarithmic means of nB_n have been considered by
other writers. See Zygmund, 27, where references are given.

¶ In the usual notation A_0 = ½a_0, A_n = a_n cos nx + b_n sin nx and B_n = b_n cos nx - a_n sin nx, n>0.
u

1 
T'(a) 
#,(t) = 


®.(t) = 


f (t — u)*'9(u)du, a> 0, 
0 


®,(t) = 0, 
dt 

a(t) = T(a + 1) *®,(2), 
d 

¢-r(t) = 


and we define Ψ_α(t), ψ_α(t), Θ_α(t), θ_α(t) in a similar way. We call φ_α(t) the
mean value of order α of φ(t).

We also write s_n^α, s̄_n^α, τ_n^α, τ̄_n^α for the Cesaro means of order α of
s_n = A_0 + A_1 + ... + A_n,* s̄_n = B_1 + B_2 + ... + B_n, τ_n = nA_n and τ̄_n = nB_n
respectively. Finally we write, for α ≥ 0,


R_α(ω) = Σ_{n<ω} (ω - n)^α A_n,    R̄_α(ω) = Σ_{n<ω} (ω - n)^α B_n,

and
r_α(ω) = ω^{-α} R_α(ω),    r̄_α(ω) = ω^{-α} R̄_α(ω).
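
For readers who want to experiment with the two kinds of means, here is a minimal numerical sketch. It assumes the standard formula s_n^α = (1/A_n^α) Σ_{ν≤n} A_{n-ν}^α a_ν with A_k^α = Γ(k+α+1)/(Γ(k+1)Γ(α+1)) for the Cesaro mean of the partial sums, and the Rieszian mean ω^{-α} Σ_{n<ω} (ω-n)^α a_n as written above; the function names and the test series 1 - 1 + 1 - ... are illustrative choices and not part of the paper.

```python
def cesaro_means(terms, alpha):
    """Cesaro means of order alpha (> -1) of the partial sums of `terms`.

    Uses s_n^alpha = (1 / A_n) * sum_{v=0}^{n} A_{n-v} * a_v, where
    A_0 = 1 and A_k = A_{k-1} * (k + alpha) / k.
    """
    count = len(terms)
    A = [1.0] * count
    for k in range(1, count):
        A[k] = A[k - 1] * (k + alpha) / k
    return [
        sum(A[n - v] * terms[v] for v in range(n + 1)) / A[n]
        for n in range(count)
    ]

def riesz_mean(terms, alpha, w):
    """Rieszian mean w^(-alpha) * sum_{n < w} (w - n)^alpha * a_n, for w > 0."""
    total = sum((w - n) ** alpha * a for n, a in enumerate(terms) if n < w)
    return total / w ** alpha

# Illustrative series 1 - 1 + 1 - 1 + ..., whose partial sums oscillate.
grandi = [(-1) ** n for n in range(100)]

order0 = cesaro_means(grandi, 0)      # order 0 reproduces the partial sums
order1 = cesaro_means(grandi, 1)      # order 1 is the ordinary arithmetic mean
riesz1 = riesz_mean(grandi, 1, 100)   # Rieszian mean of order 1 at w = 100
```

For this series the partial sums alternate between 1 and 0, while both the Cesaro mean of order 1 and the Rieszian mean of order 1 settle at 1/2.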


It is known‡ that if |φ_α(t)| = O(1) (C, 1), i.e.,
(1)    (1/t) ∫_0^t |φ_α(u)| du = O(1)


in an interval (0, η), then s_n^β = O(1) for β>α, where α ≥ 0. The Cesaro
summability and boundedness of a Fourier series for negative orders is not a “local”
property of the function, since we usually only know that A_n = o(1),
but the above result remains true when -1 ≤ α < 0 if we take η = π,

* Thus , 
v=0 
1) 
"Te + + 
† Thus r_α(ω), r̄_α(ω) are the Rieszian arithmetic means of order α of the Fourier series and allied
series respectively.
‡ See Bosanquet, 6, where references to previously known special cases are given.
provided Φ_{α+1}(t) is absolutely continuous and Φ_{α+1}(+0) = 0, in the case
α > -1,* and is absolutely continuous except at t=0 in the case α = -1.
Dr. A. C. Offord and I have recently shown that for the class of Fourier series
for which A_n = o(n^γ), γ > -1, summability (C, γ) at the point t=x does in
fact depend only on the local properties of the function, and there is an
analogous result for boundedness (C). It follows from our conditions that,
if we restrict ourselves to series for which A_n = O(n^α), the above result
remains true, with the condition in its localized form, when α ≥ -1.† Condition
(1) may also be replaced by the more general form


1 t 
(2) J | = O(1)t 


when a> —1,or 
(3) d{ up(u)} | = O(1) 


when α = -1.§ There are also “converse” results,|| in which summability or
boundedness of the Fourier series is given as an hypothesis. It follows in
particular from these results that a necessary and sufficient condition that
s_n^β = O(1) for some β is that (1) should hold for some α, or, what is the same
thing, that φ_k(t) = O(1) for some k.

The investigations of this paper arise out of certain identities which play 
an important role in the theory of Cesaro means. If a>0, we have 


and** 


(5)    α{φ_{α-1}(t) - φ_α(t)} = tφ_α'(t) = χ_α(t),


* In these circumstances $41(t) = /,a(u)du. The last condition is necessary in order to exclude 
a function like f(#) = | t-— x| —a-1 —1<a<0, whose Fourier series diverges for ‘=x. If ¢(/)= | t| —m, 
we have %,4:(/)=I'(—a@), and hence ¢2(#)=0 for every ¢>0. 

† See Bosanquet, 8, Bosanquet and Offord, 9.

‡ An integral like (2) is to be interpreted in the first instance as lim_{ε→0} ∫_ε^t, where it is assumed that
Φ_{α+1}(u) is of bounded variation in every interval (ε, η). But when condition (2) is satisfied it may be
shown that Φ_{α+1}(u) is of bounded variation in (0, t), and so the integral exists as an ordinary Lebesgue-
Stieltjes integral. See Bosanquet, 8.

§ Condition (3) is Young’s well known condition. Young, 26, stated that it was sufficient for
boundedness (C, -1+δ), for δ>0, and this was proved by Hardy and Littlewood, 13, in the case
η = π.

|| See Theorem 6, p. 201, where further references are given.
¶ Kogbetliantz, 15, and 16, pp. 23 and 30.
** For (5) and (6) see Bosanquet, 6 and 7. The analogy between these and (4) is obvious. 
where χ_α(t) is the mean value of order α of tφ'(t) whenever tφ(t) is an integral
vanishing at t=0. We also have


2 
(6) a { 6.(t)} (t) =- 
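
Identity (5), read as α{φ_{α-1}(t) - φ_α(t)} = tφ_α'(t), can be checked by hand on a power function. The worked example below is a sketch under assumptions: it uses the usual definitions Φ_α(t) = (1/Γ(α)) ∫_0^t (t-u)^{α-1} φ(u) du and φ_α(t) = Γ(α+1) t^{-α} Φ_α(t), with φ_0 = φ, and the test function φ(t) = t^k with k>0 and α ≥ 1, which is our choice and does not occur in the paper.

```latex
% Assumed test function: \varphi(t) = t^{k}, k > 0, and \alpha \ge 1.
\[
  \Phi_\alpha(t) = \frac{\Gamma(k+1)}{\Gamma(k+\alpha+1)}\,t^{k+\alpha},
  \qquad
  \varphi_\alpha(t) = \frac{\Gamma(\alpha+1)\,\Gamma(k+1)}{\Gamma(k+\alpha+1)}\,t^{k}.
\]
\[
  \alpha\{\varphi_{\alpha-1}(t)-\varphi_\alpha(t)\}
  = \Gamma(\alpha+1)\,\Gamma(k+1)\,t^{k}
    \Big\{\frac{1}{\Gamma(k+\alpha)}-\frac{\alpha}{\Gamma(k+\alpha+1)}\Big\}
  = \frac{k\,\Gamma(\alpha+1)\,\Gamma(k+1)}{\Gamma(k+\alpha+1)}\,t^{k}
  = t\,\varphi_\alpha'(t).
\]
```

The last expression is also the mean value of order α of tφ'(t) = k t^k, so both equalities in (5) hold for this choice of φ.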


It will be shown here that, if 


1 t 
(7) J | — | du = O(1) 


in an interval (0, 7), then 


(8) SAP O(1) 


for β>α, i.e., nA_n = O(1) (C, β+1). This is true with any η, 0<η<π, when
α ≥ 0, and with η = π when α ≥ -1, provided Φ_{α+1} is absolutely continuous
except at t=0. For the class of Fourier series for which A_n = O(n^α) there is a
corresponding “localization” problem, but that belongs to a rather different
line of ideas and will not be discussed in this paper. It is clear from (4) and
(5) that the results just stated are covered by Theorems 1 and 2. It has been
shown elsewhere that (7) is sufficient for the summability (C, β), β>α, of
the Fourier series whenever it is summable (C). This result appears again
(Theorem 3) as a corollary of Theorem 1, but for the more general class of
Fourier series summable (A). There are “converse” results in which
boundedness (C) of the sequence nA_n, or a more general condition, may be taken as
hypothesis (Theorem 4), and the problem of the boundedness (C) of nA_n may
also be solved in a “necessary and sufficient” form (Theorem 5). Finally this
problem and the corresponding one for allied series may be regarded as the
starting point of a sequence of more delicate problems of the same nature
(Theorems 7 and 8).

2. Before proving the theorems of this section we state as lemmas some 
results which help to explain the hypotheses.* 


Lemma 1. If α > -1, necessary and sufficient conditions that (2) should hold
and Φ_{α+1}(+0) = 0 are that


(0) f ‘1 | | = 
and φ_k(t) = O(1) for some k ≥ 0.


* These have been given elsewhere, Bosanquet, 8. 
+ This integral is to be interpreted in the sense lim,.of?. 
Lemma 2. Necessary and sufficient conditions that (3) should hold are that
(9) should hold with α = -1 and φ_k(t) = O(1) for some k ≥ 0.

Lemma 3. If α ≥ -1, and (9) holds, then it still holds if α is replaced by β>α.

There are analogous results with series in place of functions. We also
require the following lemma.


Lemma 4. If β>α>-1, and (i) Φ_{α+1}(t) is of bounded variation in an
interval (0, η), (ii) Φ_{α+1}(+0) = 0, then Φ_β(t) exists for almost all t in (0, η), and
satisfies the relation


(10)    Φ_β(t) = (1/Γ(β - α)) ∫_0^t (t - u)^{β-α-1} dΦ_{α+1}(u).


THEOREM 1. If α ≥ 0, and

(11)    (1/t) ∫_0^t |u dφ_α(u)| = O(1)*


in the interval (0, π), then nA_n = O(1) (C, α+δ), for every δ>0.†
There is an analogous theorem with o in place of O, and, more generally,
we have the following result.‡


THEOREM 1a. If α ≥ 0, (11) holds in the interval (0, π) and

(12)    (1/t) ∫_0^t |u dφ_α(u) + s du| = o(1)


as t→+0, then nA_n → s (C, α+δ), for every δ>0.


When α ≥ 1 it is enough to consider the analogous theorem for Rieszian
means. We therefore begin by proving the following theorem, which is the 
same in principle, but rather simpler in detail. 


* It is easily seen, by integration by parts, that (11) is equivalent to ∫_0^t u^σ |dφ_α(u)| = O(t^σ), σ>0, or
∫_t^π u^{-ρ} |dφ_α(u)| = O(t^{-ρ}), ρ > 0.


† When α+δ ≥ 1 the conclusion is still true if (11) holds in some interval (0, η).
t If we observe that 


=. t 
log | cosec | ~> 
1 n 


we easily see that Theorem 1a may be reduced to the o analogue of Theorem 1 by replacing ¢(¢) by 
(t)—s log | 


| 
f 
# 
4 
fi 
be 
THEOREM 2. and


1 t 
(13) —f u| od (u)| du = O(1) 
0 
in an interval (0, n), then 
(14) nA, = (C,a—1+8)* 


as w—®, for every 6>0. 
Proof of Theorem 2. We have 


(w — n)'nA, = (w — —(w- n)}An 
n<w n<w 


= Bl — } 


wrg (w), 


and so we have to show that ωr_β'(ω) = O(1) as ω→∞, for β>α. There is no
loss of generality in supposing that β<α+1. Now it has been shown
elsewhere† that


= w f galt) (wt)dt = f ba (=) Js (u)du, 
0 0 


where 


h being the greatest integer not greater than α, and γ_{1+β}(x) being given by


(16)    γ_{1+β}(x) = ∫_0^1 (1 - u)^β cos xu du.‡

It has also been shown§ that, for x>0,
<A 
(17) | (x) | 


By differentiation under the integral sign we obtain|| 


* This is equivalent to nA_n = O(1) (C, α+δ). Cf. Hobson, 14, pp. 90-98.

† Bosanquet, 6. The formula is valid for β ≥ α ≥ 0.

‡ γ_α(x) = Γ(α)x^{-α}C_α(x), where C_α(x) is Young’s generalization of the cosine function. Young,
23.


§ See Bosanquet, 3 and 6. Here and elsewhere A denotes some number independent of the
variable, or variables, under consideration, and is not necessarily the same at each occurrence.
|| It will be seen that the resulting integral converges uniformly for ω ≥ ε > 0, when β>α.
1 u
— f (=) f td (t) (wt)dt. 
0 0 
Now write, for 
— (w) f -of +of +o f =J,+1.4+ ]s3. 
Then we have, by (13) and (17), 


40 |#2@| a = 00) 
0 


and, integrating by parts in the usual manner, 


| In| < f | = O(1) 


for 8>a. Finally, if we write 


we can show from the periodicity of #(¢) that J; =O(w*—*v2-*-"), uniformly 
in vy, and hence J;=O(w*-*), for 8 >a. This completes the proof. 
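
The reduction at the start of this proof rests on a simple identity between Rieszian sums. The derivation below is a sketch that assumes the definitions R_β(ω) = Σ_{n<ω} (ω-n)^β A_n and r_β(ω) = ω^{-β} R_β(ω), and takes β>1 and ω not an integer so that termwise differentiation, R_β'(ω) = βR_{β-1}(ω), raises no question.

```latex
\[
  \sum_{n<\omega} (\omega-n)^{\beta-1}\, nA_n
  = \sum_{n<\omega}\bigl\{\omega(\omega-n)^{\beta-1}-(\omega-n)^{\beta}\bigr\}A_n
  = \omega R_{\beta-1}(\omega) - R_\beta(\omega),
\]
\[
  \omega\, r_\beta'(\omega)
  = \omega\,\frac{d}{d\omega}\bigl\{\omega^{-\beta}R_\beta(\omega)\bigr\}
  = \beta\,\omega^{-\beta}\bigl\{\omega R_{\beta-1}(\omega)-R_\beta(\omega)\bigr\}
  = \beta\bigl\{r_{\beta-1}(\omega)-r_\beta(\omega)\bigr\}.
\]
```

Hence ωr_β'(ω) equals βω^{-β} times the sum on the left of the first line, which is why a bound on ωr_β'(ω) is what has to be established for the sequence nA_n.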

Proof of Theorem 1. In what follows we need only consider the case 
0<a<f<X1. The case a=0 follows from Lemma 3, with a in place of a+1, 
but a direct proof proceeds along similar lines. 

We first observe that ®,(t) is of bounded variation in (0, 7), and ®.(+0) 
=0. For, if 0<t<z, we have by (11), writing @*(¢) = d¢.(u)|, 


1 
f | | = f — | udu) | 
1 du 


= O(1) + 0 ( log 
It follows that 


2r 
ga(t) = 0 ( log *), 


and hence ®,(+0) =0. Hence also, for 0<t<z, 



i 

d 
b=Def 
v=1 vr v=1 

4 

3 


ra +1) f | = f | | 
t t 
fu | +a | du 
t t 


2a 
= O(1) +0(« log—) 
= O(1). 


Now, if x*(, t)+7ix(n, 2) denotes the Cesaro mean of order a of the se- 
quence 


1 2 
(18) —+— 
the Cesaro mean of order 8 of (2/7) sin nt is B(n+8)—k*-1(n, t). Since 
2 * d 
nA, = =f — (sin nt)dt 
Tv 0 dt 


we now have, by Lemma 4, with 0 in place of 8 and a in place of a+1, 


nt+Bdo dt 


= f at f (t — u)~*d®,(u) 


1\ ¢* 


1 1 d 
o(-) [4a(u)J(n, 1) Jo + o(—) wa, 
n n 0 du 
where 


d 
(19) J(n, u) = f (¢ — t)dt, 


provided the last two steps be justified. 
We next show that, for 0<u<z, 
An'te 


(20) | J(n, u) | { 


< Antta-by-8 


Now, for 0<t<π, 0<β<1, we have*


* Cf. Zygmund, 27, and Gergen, 10. Both these papers are concerned with «*-1(n, ¢), but contain 
enough analysis to show how (21) and (22) are obtained. 
(21)
and 

< An?® 


An? 


d 


Hence writing, for u+n-!<z7,* 


1 


we have, by (22), 


utn =! 
| Ji | < Ants f (¢ — u)~* min (n*, t-*)dt 


utn=* 
min (n*, u-*) f (¢ — u)—dt 


= min (n8, u-*), 


and, by the second mean-value theorem, 


n dt 


n*O(n'-*) min (n*, u-*) 


by (21). This establishes (20). 

Returning now to the main theme of the proof, we see that the inversion 
of the repeated integral is justified, the resulting integral being absolutely 
convergent, and we obtain 


* When u+n^{-1} ≥ π the second integral does not occur, and the argument is simpler.
Where 
t Since, by (22), 


< An? fe — u)~dt 


< Ant 


for α<1. We use the analogue for Stieltjes integrals of Fubini’s theorem; see Bosanquet, 8.
1 d
= + o(—) f ba(u)ut u)du 


= + o(—) E (u) 
n 0 dv 0 


It will be enough now to show that, for 0<u<π,
An'tey2 


» 
(23) f v* — J (n, v)dv 
0 dv 
For, if this be established, it will follow that the integrated term is 
1 @ 
o(—) f v* — J(n, v)dv = O(n=-*), 
n dv 
and, writing 


1 u d 1 na! 1 
—f s)d0 = — f +—f = K,+ Ko, 
nN o . dv Nn Jo n J 


we shall have, from (11), integrating by parts in the usual way, 


< Ane | = O(1) 
0 


| K2| < f | = O(1) 


n 


for 8 >a. The theorem will therefore be proved. 
To establish (23), we first observe that 


u d u 
f v* — J(n, v)dv = (n, u) — af (n, v)dv, 
0 dv 0 
and the first inequality follows from the first inequality (20). We next
observe that, if φ(t)=1, then φ_α(t)=1 for all t and τ_n^β=0 for all n. Hence,
following through our previous reasoning with this special value of φ(t), we
find that


B 
0 = «— J(n, »)do. 
(n v)dv 


If we now write 
u d
f = f -f 
0 dv 0 u 


we have J’ =O(n'+«-*), and finally, observing that 


d 
f v)dv = [vJ(n, v) — af v*-lJ (n, v)dv, 
u v u 


we obtain, from the second inequality (20), J’’ =O(n'+«-*) +O(n!+2-8y-8), 
The second inequality (23) now follows, for 0<u<π.

This completes the proof. 

We have as a corollary of Theorem 1, after our remarks in the
introduction, the following theorem.

THEOREM 3. If α ≥ 0 and (11) holds in the interval (0, π), then the Fourier
series of f(t) is summable (C, α-1+δ) at the point t=x, for every δ>0, if it is
summable (A).

3. The next theorem shows that a condition of the type (11) is also 
necessary for the boundedness (C) of the sequence. 

We confine ourselves to the Rieszian form of the theorem. 


THEOREM 4. If a20 and 


(24) dr_(u) | = O(1)* 


as w—>© , then tpg (t) =O(1) in the interval (0, 7), for B>a+1. 

There is an analogous result with o in place of O. 

It follows from (24) that $r_\alpha(\omega) = O(\log \omega)$ as $\omega \to \infty$. If then β > α + 1,
t > 0, and h is the greatest integer not greater than α, we have, by an argument
used elsewhere, since $A_n = o(1)$,


* The condition is satisfied in particular if a=1 and wr¢ (w) =O(1), i-e., 
= O(w) (C, — 1), 


n<@ 
or, what is equivalent, A,=O(1)(C, a). When a=0 (24) becomes 
n| An | = 
n<@w 


+ See Bosanquet, 3 and 4, The argument was previously used with the hypothesis Ra(w) = 0(w%). 


In the present case we have 
R,(w) = OSpsa-li, 


= O(w* log w), a-1<pSa, 

= Ow? log w), p2a, 

as x0, and yg‘)(x) =O(1) for x>0. The inversion of the repeated integral remains valid in the 
present case. 


asw—, Also 
= — f (tw)
0 


= 


(— 
I'(h + 2) 


* (me (u — “Ra(v)do 


(- 1 )h+2ph+2 
(u — 9) (tu)du 


J 


It has also been shown* that, for x>0, 8>a+121, 


A 
(25) 


It follows, by differentiation under the integral sign,f that 
1 w 
(t) = — —f wdra (=) if 
0 0 


Therefore, writing 


= f +i f =1,+ L2, 


ans 


|L:|s at f | vdva(v) | = O(1) 


| L,| < antes f | vdv,(v) | = O(1) 


for 8>a+1. This proves the theorem. 


* Bosanquet, 3. 
+ The resulting integral converges uniformly for /2«>0, when B>a+1. 


Combining Theorems 1 and 4 we now have the following theorem.

THEOREM 5. A necessary and sufficient condition that nA,=O(1) (C) is 
that, for some x=0, ¢,(t) should be absolutely continuous except at t=O and 
i! (t) =O(1) in the interval (0, 7). 

There is an analogous result with o in place of O. 

We also add the following theorem. 


THEOREM 6. If a20 and 
1 @ 
(26) — f ra(u)| = 010) 
w Jo 


as then s(t) =O(1) in the interval (0, for B>a+1. 


There is an analogous result with o in place of O.* 

It follows from (26), by the analogue of Lemma 1, that (24) is satisfied, 
with a+1 in place of a, and >>A, is bounded (C). Hence é@f4:(¢) =O(1) for 
B>a+1, by Theorem 4, and ¢(#) =O(1) (C).f The result may now be ob- 
tained from (5). 

4. Finally we can generalize the Rieszian form of Theorem 5 as follows. 


THEOREM 7. If \ is a non-negative integer, a necessary and sufficient condi- 
tion that 


=) rg(w) = O(1) as 


for some B=X, is that 


d 
¢ = O(1) in the interval (0, 


for some where ¢,(t) is a Ath integral except at t=0. 
Suppose that for some a=) we have 
d 
(=) o.(t) = O(1) for 


Then if 0<y Si, we have 


* For the case a=0 see Bosanquet, 4. The hypothesis is satisfied in particular when }_A, is 
summable (C, a). See Hardy and Littlewood, 11, Paley, 18, Verblunsky, 19, Wiener, 21, 22, Bosan- 
quet, 3. 

1 Hardy and Littlewood, 11. 
=)" = 04{ 1 0
(4) 


and it is easily shown by induction that 


(27) =) = po f (=) 


for 8>a. If, on the other hand, for some a=) we have 


(« <) ra(w) = O(1) asw om, 


(« ra(w) = Of (log as w— ©, 


and we obtain, for 8>a+1, 


(28) B os(t) = (— (w (tu)du. 


The result follows from .(27) and (28) by arguments analogous to those of 
Theorems 2 and 4. 
The analogue of Theorem 7 for conjugate series is as follows.* 


THEOREM 8. [f \ is a non-negative integer, a necessary and sufficient condi- 
tion that 


d\> 
( = O(1) 
dw 


for some B=X, is that 


d 
= O(1) in the interval (0, x) 


for some k=, where 0,(t) is a th integral except at t=0. 


It is interesting to note that, by (6), if k=1,T 
t : ¥i.(t) 
dt 


and so, when \21, the condition in Theorem 8 takes the form 


* The previously known cases were \=0 or 1. Hardy and Littlewood, 12. 
+t When 0<<1 this is true except possibly in a set of measure zero. 
t) = O(1
= O(1). 
A NEW METHOD FOR WARING THEOREMS WITH
POLYNOMIAL SUMMANDS, II* 


BY 
L. E. DICKSON 


1. In a paper† with the same title, I showed how to deduce instantaneously
a Waring theorem for an even polynomial f(x) of degree 2n from a
known Waring theorem for a polynomial q(x) of degree n. Here I extend the
method to the new case in which f(x) contains also a term in x.

2. First, let n = 2 and


(1) $f(x) = ux^4 + vx^2 - wx + k, \qquad q(x) = ux^2 + vx + 2k.$

We have the identity in a, b, c, d, u, v, w, k

(2) $6q(s) = \sum f(z), \qquad s = a^2 + b^2 + c^2 + d^2,$

in which z takes the following twelve values:

(3) $b \pm a,\quad c \pm a,\quad d \pm a,\quad \pm b - c,\quad \pm c - d,\quad \pm d - b,$

whose sum is zero. Since some of the numbers (3) are negative, we impose
the condition


(4) f(x) is an integer ≥ 0 for all integers x.


But when x ranges over all integers (positive, negative, or zero), evidently
f(−x) takes the same values as f(x). Without loss of generality we may
therefore take w ≥ 0. Since f(−x) = f(x) + 2wx, f(−x) will be ≥ 0 for all integers
x ≥ 0 if the same is true for f(x). Hence (4) follows from


(5) f(x) is an integer ≥ 0 for every integer x ≥ 0.


Since u > 0, only a limited number of integers x yield negative values of
f(x) − k. If one of these values is −P, while all the remaining are ≥ −P,
then (5) holds if and only if k ≥ P. In brief, we need only take k sufficiently
large in (1).
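
The identity (2) can be checked numerically. The sketch below is only an illustration, not part of the original argument: it assumes that s stands for a² + b² + c² + d² and that the twelve values (3) are b ± a, c ± a, d ± a, ±b − c, ±c − d, ±d − b. The function names are hypothetical.

```python
from itertools import product

def f(x, u, v, w, k):
    # quartic of (1): f(x) = u x^4 + v x^2 - w x + k
    return u * x**4 + v * x**2 - w * x + k

def q(x, u, v, k):
    # quadratic of (1): q(x) = u x^2 + v x + 2k
    return u * x**2 + v * x + 2 * k

def twelve_values(a, b, c, d):
    # the twelve arguments (3), as read here (assumption)
    return [b + a, b - a, c + a, c - a, d + a, d - a,
            b - c, -b - c, c - d, -c - d, d - b, -d - b]

def identity_holds(a, b, c, d, u, v, w, k):
    s = a * a + b * b + c * c + d * d
    zs = twelve_values(a, b, c, d)
    lhs = 6 * q(s, u, v, k)
    rhs = sum(f(z, u, v, w, k) for z in zs)
    return sum(zs) == 0 and lhs == rhs

# exhaustive check on a small box of integers with arbitrary coefficients
all_ok = all(identity_holds(a, b, c, d, 3, -5, 7, 11)
             for a, b, c, d in product(range(-3, 4), repeat=4))
print(all_ok)
```

The linear term −wx drops out of the sum precisely because the twelve arguments add up to zero, which is why the method extends to polynomials containing a term in x.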

Consider triangular, pyramidal, and figurate numbers


$T(x) = (x^2 - x)/2, \qquad P(x) = (x^3 - x)/6,$


(6) $F(x) = (x + 2)(x + 1)x(x - 1)/24 = (x^4 + 2x^3 - x^2 - 2x)/24.$


* Presented to the Society, November 29, 1935; received by the editors June 14, 1935. 
† These Transactions, vol. 36 (1934), pp. 731-748. In (36) read +d for +d. In (38) delete
exponent 6.
Any quartic function with rational coefficients can evidently be expressed in
the form
(7) $f = AF + gP + hT + tx + k,$
where A, ..., k are rational numbers. Let (5) hold. Taking x = 0, ..., 4,
we see that

$k, \quad t + k, \quad A + g + h + 2t + k, \quad 5A + 4g + 3h + 3t + k, \quad 15A + 10g + 6h + 4t + k$

must be integers. Hence k, t, A, g, h are integers. The coefficients of $x^3$ and x
in (7) are


$A/12 + g/6, \qquad -(A/12 + g/6) - h/2 + t.$


These must be 0 and ≤ 0 respectively if (7) shall be of the form (1) with
w ≥ 0. Hence
(8) $A + 2g = 0, \qquad h \ge 2t.$


3. Of special interest are functions f for which


(9) Every integer ≥ 0 is a sum of V values of f(x)


for integers x. The smaller A is, the more slowly will f increase with x, and
the smaller V will be in general. Hence we give to A its minimum (even)
value 2. Evidently (9) requires that


(10) f(y) = 0, f(z) = 1 for certain integers y, z.
The functions (7) satisfying (4), (8), and (10) and having A = 2, t ≥ −5,
are found to be those with
t 1 0 0-1 —-1-—-2 —2 —2 —3 
h |22 0 1 —1 i-2 —1 0 2-4 1 3 
k 0 0 0 2 1 6 4 3 2 16 ++ 3 


Hence each of these functions represents 0 and 1, and is an integer ≥ 0 for
every integer x. The general theory therefore yields a value of V in (9) and
hence a universal Waring theorem for summands f(x).
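
The conditions (4), (8), and (10) with A = 2 can be tested mechanically for any given pair t, h. The following sketch is an illustration under stated assumptions, not Dickson's own procedure: it takes g = −1 (forced by A + 2g = 0), reads (8) as h ≥ 2t, and searches a finite window of x (the window −60..60 is an assumption, ample for small t and h because the quartic term dominates). All function names are hypothetical.

```python
def F(x):
    # figurate number; four consecutive integers are divisible by 24
    return (x + 2) * (x + 1) * x * (x - 1) // 24

def P(x):
    # pyramidal number (x^3 - x)/6
    return (x + 1) * x * (x - 1) // 6

def T(x):
    # triangular number (x^2 - x)/2
    return x * (x - 1) // 2

XS = range(-60, 61)  # assumed search window for x

def base(x, t, h):
    # form (7) with A = 2, g = -1 and k = 0
    return 2 * F(x) - P(x) + h * T(x) + t * x

def smallest_k(t, h):
    # least k making f(x) >= 0 on the window; then f takes the value 0
    return -min(base(x, t, h) for x in XS)

def admissible(t, h):
    """Return k if (t, h) satisfies (8) and f represents 0 and 1, else None."""
    if h < 2 * t:
        return None
    k = smallest_k(t, h)
    values = {base(x, t, h) + k for x in XS}
    return k if 1 in values else None

print(admissible(-2, -1))
```

For the set t = −2, h = −1 used in the next section this returns k = 4, in agreement with the text.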


t |-4 -4 -4 -4 -4 -5 -5 
h|}-3-1 0 2 4-6 3 5 
15 9 7 5 4 3 6 5 
4. Consider, for example, the seventh set t = −2, h = −1, k = 4. Then
$f(x) = 2F - P - T - 2x + 4 = (x^4 - 7x^2)/12 - \tfrac{3}{2}x + 4,$
$Q = 6q = H + 42, \qquad H = \tfrac{1}{2}(x - 3)(x - 4).$


Every integer ≥ 0 is a sum of three values of the triangular number H for
integers x ≥ 4. Thus every integer ≥ 126 is a sum of three values of Q. Hence
by (2), every integer ≥ 126 is a sum of 36 values of f(x). We next verify this
fact also for positive integers < 126 and indicate a probable reduction from
36 to 5. By a table to 1000 of sums of three values of f(x), we find that 415,
734, and 749 are the only positive integers < 1000 which are not sums of
four values. We find at once that all integers < 5114 are sums of five values
of f(x) for integers x (positive, negative, or zero).
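
A short computation makes the example concrete. This is a sketch only: it assumes the readings f(x) = (x⁴ − 7x²)/12 − (3/2)x + 4, q(x) = x²/12 − 7x/12 + 8 (that is, (1) with u = 1/12, v = −7/12, k = 4) and H(x) = (x − 3)(x − 4)/2. The helper `sums_of` is hypothetical and merely tabulates sums of a fixed number of values.

```python
from fractions import Fraction

def f(x):
    # f(x) = (x^4 - 7x^2)/12 - (3/2)x + 4, exact rational arithmetic
    return Fraction(x**4 - 7 * x**2, 12) - Fraction(3 * x, 2) + 4

def q(x):
    # q(x) = u x^2 + v x + 2k with u = 1/12, v = -7/12, k = 4
    return Fraction(x * x, 12) - Fraction(7 * x, 12) + 8

def H(x):
    # triangular number (x - 3)(x - 4)/2
    return Fraction((x - 3) * (x - 4), 2)

def sums_of(values, r, limit):
    # integers <= limit that are sums of exactly r of the given values
    reach = {0}
    for _ in range(r):
        reach = {s + v for s in reach for v in values if s + v <= limit}
    return reach

# f is integer valued and non-negative, and represents 0 and 1
integral = all(f(x).denominator == 1 and f(x) >= 0 for x in range(-30, 31))
shifted = all(6 * q(x) == H(x) + 42 for x in range(-30, 31))
print(integral, shifted, f(2), f(3))
```

Since f(2) = 0, sums of exactly r values of f are the same as sums of at most r values, so `sums_of(sorted({int(f(x)) for x in range(-12, 13)}), 4, 1000)` can be compared with the statement about 415, 734, and 749.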
5. Waring theorem for sextic polynomials. Let

(12) $f(x) = ux^6 + vx^4 + wx^2 - hx + k, \qquad q(x) = 120ux^3 + 72vx^2 + 60wx + 108k.$


Then a like generalization of (38) of the former paper gives

(13) $q(s) = \sum f(y) + 8\sum f(z) + f(2a) + f(2b) + f(2c) + f(2d),$

where z ranges over the twelve values (3), and y ranges over the eight values

(14) $-a - b - c \pm d,\quad \pm a - b + c - d,\quad \mp a + b - c - d,\quad \pm(a - b - c) + d,$

whose sum is −2a−2b−2c−2d. Hence the sum of the 108 arguments of f
in (13) is zero. In case f(x) is an integer ≥ 0 for every integer x (which is
true when k is sufficiently large), a Waring theorem for q leads instantly to
one for f. This condition (4) holds if f(x) is
(15) $x^6 + x^2 - x, \qquad \tfrac{1}{2}(x^6 + 3x^2) - x,$
each of which represents 0 and 1. Hence each yields at once a universal Waring
theorem.
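
The sextic identity can be tested in the same way as the quartic one. The sketch below rests on assumptions about the damaged displays: q(x) = 120ux³ + 72vx² + 60wx + 108k, a weight 8 on the sum over the twelve values z, the eight values y listed as in the code, and s = a² + b² + c² + d². It is an illustration with hypothetical function names.

```python
from itertools import product

def f6(x, u, v, w, h, k):
    # sextic: f(x) = u x^6 + v x^4 + w x^2 - h x + k
    return u * x**6 + v * x**4 + w * x**2 - h * x + k

def q6(x, u, v, w, k):
    # assumed reading of (12): q(x) = 120u x^3 + 72v x^2 + 60w x + 108k
    return 120 * u * x**3 + 72 * v * x**2 + 60 * w * x + 108 * k

def z_values(a, b, c, d):
    # the twelve values (3), as read here
    return [b + a, b - a, c + a, c - a, d + a, d - a,
            b - c, -b - c, c - d, -c - d, d - b, -d - b]

def y_values(a, b, c, d):
    # the eight values (14), as read here; their sum is -2(a+b+c+d)
    return [-a - b - c + d, -a - b - c - d,
            a - b + c - d, -a - b + c - d,
            -a + b - c - d, a + b - c - d,
            (a - b - c) + d, -(a - b - c) + d]

def sextic_identity_holds(a, b, c, d, u, v, w, h, k):
    s = a * a + b * b + c * c + d * d
    rhs = (sum(f6(y, u, v, w, h, k) for y in y_values(a, b, c, d))
           + 8 * sum(f6(z, u, v, w, h, k) for z in z_values(a, b, c, d))
           + sum(f6(2 * t, u, v, w, h, k) for t in (a, b, c, d)))
    return q6(s, u, v, w, k) == rhs

sextic_ok = all(sextic_identity_holds(a, b, c, d, 2, -3, 5, 7, 1)
                for a, b, c, d in product(range(-2, 3), repeat=4))
print(sextic_ok)
```

The count 8 + 8·12 + 4 = 108 of arguments accounts for the constant term 108k in q.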

6. Quartics with property (4). Replacing x by −x−1 in (7), we get
(16) $AF(x) - gP(x) + (h - g)T(x) + (2h - g - t)x + h - t + k.$
Hence (7) remains unaltered if and only if
(17) $g = 0, \qquad h = t.$
In this case the values of f(x) for negative integers coincide with its values
for integers x ≥ 0. Such unfavorable cases are


(18) $f = F(x), \qquad F - T - x + 2, \qquad F - 6T - 6x + 56,$


whose values are ≥ 0 for an integer x ≥ 0 and hence for all integers. By tables
to 1000, all integers from 0 to 3366 inclusive are sums of 7 values of F(x)
except only 64, 99, 119, 189, 314, 774. Hence all ≤ 23841 are sums of 8.

Since T(−x) = T(x+1), F+T is ≥ 0 for all integers. By a table to 1000
of sums by three, it was verified that all integers from 0 to 3900 are sums of
four values of F+T.

Miss H. Rees found that (7) has property (4) if 


1, g=—p-2, h=pt+i1+ gp(2pt+ 1) +m, 
p= 0, m= 0. 
To obtain an integer h take m to be the sum of an integer ≥ 0 and ½ if p ≡ 1 or 3
(mod 6); 0 if p ≡ 0 or 4; ⅓ if p ≡ 2; ⅚ if p ≡ 5 (mod 6). We may remove the term
involving P by the transformation x = y + p + 2. We get
ag) P+ OT + {1 — 1+ m(p + + (b+ 2){1 + ( + IJ}, 
r=¢p(p+2), J= (p+ 1)(p — 12)/24+ p + 1). 


When p = 1, m = ½, (19) is f = F + 2y + 5. By a table of sums of three values
from 0 to 3000 and from 9000 to 11000, it was found that every integer
< 16151, except only 11784, is a sum of four (positive) values of (16) for
positive and negative integers x. It follows that all < 210739 are sums of five
such values.
Another favorable function f = F + y + 2 is the case p = m = 0 of (19); it
represents 0, 1, 2, 3, 5, 10, 12, 21, etc.
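
The list of values represented by F + y + 2 is easy to reproduce. The snippet below is a small illustrative check, assuming F(y) = (y + 2)(y + 1)y(y − 1)/24 and letting y run over positive and negative integers; the range −40..40 is an arbitrary window that comfortably contains the smallest values.

```python
def F(y):
    # figurate number (y+2)(y+1)y(y-1)/24
    return (y + 2) * (y + 1) * y * (y - 1) // 24

def g(y):
    # the favorable function F + y + 2
    return F(y) + y + 2

values = sorted({g(y) for y in range(-40, 41)})
print(values[:8])
```

Because F(y) is unchanged under y → −y−1 while the linear term changes, negative arguments contribute new values, which is what makes this function favorable compared with the cases (18).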


UNIVERSITY OF CHICAGO, 
THE VOLUME OF THE FUNDAMENTAL DOMAIN
FOR SOME INFINITE GROUPS* 


BY 
CARL LUDWIG SIEGEL 


Let Q be any region in m-dimensional euclidean space which is invariant
under a group Γ of real homogeneous linear transformations of the coordinates.
The group Γ has a fundamental domain F on Q if F is mapped by the
different transformations of Γ into a set of domains which completely fill out
Q without overlapping one another. It is obvious that then Γ is countable.
If all the substitutions of the group have the determinant ±1, the volume v
of F is uniquely determined by Q and Γ. The reciprocal value of v is a certain
measure for the order of Γ; in fact, if $\Gamma_1$ is a subgroup of Γ with the index g,
the volume of the fundamental domain of $\Gamma_1$ is exactly gv.

It is known from the analytic theory of quadratic forms how to find v
if Γ is the group of automorphisms of a quadratic form with integer coefficients.
Minkowski, in his last investigations on the theory of numbers, determined
the value of v in another case, which also has interesting applications
to the problem of the closest packing of n-dimensional spheres. Let
$\sum_{k,l=1}^{n} s_{kl}x_kx_l$ be any positive definite quadratic form of n variables and Q
that part of the space of the n(n+1)/2 coefficients $s_{kl}$ (1 ≤ k ≤ l ≤ n) where
the determinant $|s_{kl}|$ is not greater than a fixed positive number q. By applying
any substitution $x_k = \sum_l c_{kl}y_l$ with integer coefficients whose determinant
is ±1, a linear transformation of the $s_{kl}$ is induced which leaves Q
invariant. The group Γ of these transformations of the quadratic form is obviously
isomorphic to the factor-group of the group of all unimodular substitutions
of n variables with respect to the subgroup of order 2 generated
by $x_k = -y_k$ (k = 1, ..., n). A fundamental domain of Γ on Q is the region
F of the reduced positive definite quadratic forms of n variables whose determinant
is not greater than q. Minkowski proved that F is bounded by a
finite number of planes and the surface $|s_{kl}| = q$. Moreover, he calculated
explicitly the volume of F as a function of n and q, namely,


(1) $v = \frac{2}{n+1}\, q^{(n+1)/2} \prod_{k=2}^{n} \pi^{-k/2}\,\Gamma\!\left(\frac{k}{2}\right)\zeta(k),$


where ζ(s) denotes the zeta function of Riemann.


* Presented to the Society, October 26, 1935; received by the editors April 1, 1935. 
The purpose of the present paper is to prove Minkowski’s formula (1) 
by a simple analytic method and to generalize it to the case of any algebraic 
number field. A special application gives the non-euclidean volume of the 
fundamental domain for the modular group in every totally real algebraic 
field. Blumenthal and Hecke have shown the importance of the corresponding 
modular functions for algebraic and arithmetic investigations. Since the 
knowledge of a set of generators of the modular group is necessary for the 
construction of any example in the theory of modular functions, the determi- 
nation of the volume of the fundamental domain can be useful for further 
researches. 

1. Let $Q_0$ be the space of all positive definite symmetric matrices X of n
rows, and $F_0$ its fundamental domain for the group of all transformations
C′XC, where C is any unimodular matrix of n rows and C′ its transposed. The
trace of X is denoted by σ(X), the determinant of X by |X|, and dX is the
½n(n+1)-dimensional volume element in $Q_0$. The formula


(2) $\prod_{k=0}^{n-1} \pi^{-(s+k)/2}\,\Gamma\!\left(\frac{s+k}{2}\right) = \int_{Q_0} |X|^{s/2-1} e^{-\pi\sigma(X)}\,dX$

holds for every s with positive real part and can be proved by complete induction,
starting with Euler’s definition of the gamma function. Let A be any
real matrix of n rows and columns, whose determinant is not zero. Then
A′XA can be substituted for X in (2). Hence


(3) $\phi(s)\,|A'A|^{-(s+n-1)/2} = \int_{Q_0} |X|^{s/2-1} e^{-\pi\sigma(A'XA)}\,dX,$


where φ(s) is an abbreviation for the left side of (2).

If X runs over the fundamental domain $F_0$ and C over all unimodular
matrices of n rows, the matrices C′XC = (−C)′X(−C) completely fill out
twice the space $Q_0$. Therefore (3) can be transformed into the equation


(4) $2\phi(s)\,|A'A|^{-(s+n-1)/2} = \int_{F_0} |X|^{s/2-1} \sum_{C} e^{-\pi\sigma(A'C'XCA)}\,dX.$

A matrix B is called associated to A, if B = CA with unimodular C. In
(4), the matrix CA runs over all associates to A. It is clear that the determinants
of all associates have the same absolute value. There exist only a finite
number of non-associate integer matrices A whose determinants have a fixed
absolute value a ≠ 0. In fact, Eisenstein has proved that their number is

(5) $\psi(a) = \sum a_2 a_3^{2} \cdots a_n^{n-1},$

where $a_1, \ldots, a_n$ run over all systems of solutions of $a_1 \cdots a_n = a$ in positive
integers.

Let the real part of s be greater than 1 and sum (4) over a complete system
of non-associated integer matrices A whose determinants are different from
zero. Since


$\sum_{a=1}^{\infty} \psi(a)\,a^{-s-n+1} = \zeta(s)\zeta(s+1)\cdots\zeta(s+n-1),$

the result is

(6) $2\phi(s)\,\zeta(s)\zeta(s+1)\cdots\zeta(s+n-1) = \int_{F_0} |X|^{s/2-1} \sum_{A} e^{-\pi\sigma(A'XA)}\,dX,$


where A runs over all integer matrices with |A| ≠ 0.
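
Eisenstein's count is easy to tabulate for small cases. The sketch below assumes the count has the form ψ(a) = Σ a₂a₃²⋯aₙ^{n−1} over factorizations a = a₁⋯aₙ in positive integers, which is the form compatible with the Dirichlet series identity above; the function names are hypothetical and the divisor search is deliberately naive.

```python
def divisors(a):
    # all positive divisors of a (naive; fine for small a)
    return [d for d in range(1, a + 1) if a % d == 0]

def psi(a, n):
    """Sum of a_2 * a_3^2 * ... * a_n^(n-1) over a_1 * ... * a_n = a."""
    def rec(rest, k):
        # contribution of a_k, ..., a_n whose product is `rest`
        if k == n:
            return rest ** (n - 1)
        return sum(d ** (k - 1) * rec(rest // d, k + 1)
                   for d in divisors(rest))
    return rec(a, 1)

def sigma(a):
    # sum of divisors, the case n = 2
    return sum(divisors(a))

print([psi(a, 2) for a in range(1, 9)])
print(psi(2, 3), psi(3, 3), psi(6, 3))
```

For n = 2 the count reduces to the sum of the divisors of a, and ψ is multiplicative in a, as the product of zeta functions suggests.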
The left side of (6) is a meromorphic function of s which has a pole of
first order at s = 1. The residue at this pole is


(7) $\rho = 2\prod_{k=1}^{n} \pi^{-k/2}\,\Gamma\!\left(\frac{k}{2}\right)\cdot\zeta(2)\cdots\zeta(n).$

To study the behavior of the right side of (6) near s = 1, the well known
method from the theory of the zeta functions can be used. Divide $F_0$ into
two parts $F_1$ and $F_2$, corresponding to |X| < 1 and |X| > 1. The integral in
(6) then splits up into the sum of the two integrals over $F_1$ and $F_2$. The second
integral is an integral function of s. Furthermore the function


$\int_{F_1} |X|^{s/2-1} \sum_{A} e^{-\pi\sigma(A'XA)}\,dX,$


where A runs over all integer matrices with |A| = 0, is regular near s = 1.
Hence ρ is also the residue of


$\int_{F_1} |X|^{s/2-1} \sum_{A} e^{-\pi\sigma(A'XA)}\,dX,$


where A runs over all integer matrices, at the point s = 1. Now, from the
theory of theta functions, the formula


$\sum_{A} e^{-\pi\sigma(A'XA)} = |X|^{-n/2} \sum_{A} e^{-\pi\sigma(A'X^{-1}A)}$


is known. Hence ρ is the residue of


(8) $\int_{F_1} |X|^{(s-n)/2-1}\,dX + \int_{F_1} |X|^{(s-n)/2-1} \sum_{A} e^{-\pi\sigma(A'X^{-1}A)}\,dX,$


where A runs over all integer matrices except the zero matrix. The second
integral in (8) is again regular at s = 1, and ρ is the residue of the first integral
in (8).

If $v_1$ is the volume of $F_1$, the fundamental domain F, which is the part
|X| < q of $F_0$, has the volume $v_1 q^{(n+1)/2}$. Hence


(9) $\rho = \lim_{s\to 1}(s-1)\int_{F_1}|X|^{(s-n)/2-1}\,dX = \lim_{s\to 1}(s-1)\,\frac{n+1}{2}\,v_1\int_0^1 q^{(s-1)/2-1}\,dq = (n+1)v_1,$

and (1) follows from (7) and (9).
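
Minkowski's formula is simple to evaluate numerically. The following sketch assumes the formula in the form v = 2/(n+1) · q^{(n+1)/2} · ∏_{k=2}^{n} π^{−k/2} Γ(k/2) ζ(k); the zeta values are approximated by a partial sum with a midpoint tail estimate, which is an implementation choice made here and not part of the paper.

```python
import math

def zeta(k, terms=10000):
    # partial sum plus a midpoint estimate of the tail; for integer k >= 2
    head = sum(m ** -k for m in range(1, terms + 1))
    tail = (terms + 0.5) ** (1 - k) / (k - 1)
    return head + tail

def minkowski_volume(n, q=1.0):
    # assumed form of (1): 2/(n+1) * q^((n+1)/2) * prod_{k=2..n} pi^(-k/2) Gamma(k/2) zeta(k)
    prod = 1.0
    for k in range(2, n + 1):
        prod *= math.pi ** (-k / 2) * math.gamma(k / 2) * zeta(k)
    return 2.0 / (n + 1) * q ** ((n + 1) / 2) * prod

for n in range(1, 6):
    print(n, minkowski_volume(n))
```

Two easy consistency checks: for n = 1 the reduced forms are sx² with 0 < s ≤ q, so the volume is q, and for n = 2, q = 1 the formula gives (2/3)·π⁻¹·ζ(2) = π/9.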

2. The group of the matrices C with integer rational elements and the
determinant ±1 has a generalization in any algebraic number field. It consists
of all matrices C of n rows, for which the elements of C and $C^{-1}$ are
integers of the field K. These matrices will be called unimodular in K. Their
determinants are units of the field K. The definition of the associates of a
matrix can be at once extended to the case of any K: the matrix B is associated
to A, if B = CA with a unimodular C in K. Since the determinant of
an associate of A can only differ from the determinant |A| = a by a factor
which is a unit of K, the determinants of all associates to an integer matrix
of K define the same principal ideal (a). Eisenstein’s result (5) has been generalized
by Hurwitz. He proved that the number of non-associate integer
matrices A of K with n rows, whose determinants a ≠ 0 define the same principal
ideal (a), is
(10) $\psi(a) = \sum N\mathfrak{a}_2\, N\mathfrak{a}_3^{2} \cdots N\mathfrak{a}_n^{n-1},$


where the symbol N denotes the norm and $\mathfrak{a}_1, \ldots, \mathfrak{a}_n$ run over all systems
of solutions of $\mathfrak{a}_1 \cdots \mathfrak{a}_n = (a)$ in integer ideals $\mathfrak{a}_1, \ldots, \mathfrak{a}_n$.

In this section only the simpler case of a totally real field K will be investigated.
If l is the degree of K, the l conjugates of any matrix A with elements
of K will be denoted by $A_1, \ldots, A_l$. Let $X_1, \ldots, X_l$ be any l positive
definite symmetric matrices of n rows, $Q_0$ the space of their ½n(n+1)l coefficients,
and Q the part of $Q_0$ defined by the inequality $|X_1 \cdots X_l| < q$. If C is
unimodular, the transformation $C_1'X_1C_1, \ldots, C_l'X_lC_l$ leaves Q invariant. The
problem is to prove the existence of a fundamental domain F on Q with respect
to these transformations, which is bounded by a finite number of planes
and the surface $|X_1 \cdots X_l| = q$, and to calculate the volume of F. The first
part of the problem requires the theory of reduction of positive definite quadratic
forms in K and can be solved without serious difficulty by generalizing
Minkowski’s ideas. Here only the solution of the second part, the determination
of the volume v of F by analytic methods, will be explained in detail.
From (3) there follows for every integer matrix A of K with n rows, whose
determinant does not vanish, the equation


(11) $\phi^{l}(s)\,N(A'A)^{-(s+n-1)/2} = \int_{Q_0} N(X)^{s/2-1} e^{-\pi\sigma(A'XA)}\,dX;$

here N(X) denotes the product of the determinants of $X_1, \ldots, X_l$, and
σ(A′XA) the sum of the traces of $A_1'X_1A_1, \ldots, A_l'X_lA_l$. If $F_0$ is a fundamental
domain on $Q_0$, the right side of (11) can be transformed in analogy
to (4). By summing (11) over a complete system of non-associated integer
matrices A of K with |A| ≠ 0, the equation


(12) $2\phi^{l}(s)\sum_{(a)} \psi(a)\,|N(a)|^{-(s+n-1)} = \int_{F_0} N(X)^{s/2-1} \sum_{|A|\neq 0} e^{-\pi\sigma(A'XA)}\,dX$

arises, where (a) runs over all integer principal ideals and A over all integer
matrices with |A| ≠ 0; the real part of s must be greater than 1.
If h is the class-number of K, there exist exactly h different characters
$\chi(\mathfrak{a})$ of the class-group. The sum $\sum_\chi \chi(\mathfrak{a})$ is h, if $\mathfrak{a}$ is a principal ideal, and 0
otherwise. Let


$\zeta_\chi(s) = \sum_{\mathfrak{a}} \chi(\mathfrak{a})\,N\mathfrak{a}^{-s}$


denote Dedekind’s zeta function with class-characters. Then, by (10),


(13) $\sum_{(a)} \psi(a)\,|N(a)|^{-(s+n-1)} = \frac{1}{h}\sum_{\chi} \zeta_\chi(s)\zeta_\chi(s+1)\cdots\zeta_\chi(s+n-1).$


Now it is known that $\zeta_\chi(s)$ is an integral function if χ is not the principal
character. For the principal character, $\zeta_\chi(s)$ is the function


$\zeta_K(s) = \sum_{\mathfrak{a}} N\mathfrak{a}^{-s},$


which is regular for s ≠ 1 and has at s = 1 a pole of first order with the residue
$2^{l-1}D^{-1/2}Rh$, where R and D are regulator and discriminant of K. Hence the
residue of the left side of (12) at s = 1 is


(14) $\rho = 2^{l} D^{-1/2} R \prod_{k=1}^{n}\left\{\pi^{-k/2}\,\Gamma\!\left(\frac{k}{2}\right)\right\}^{l} \prod_{k=2}^{n}\zeta_K(k).$


The calculation of the residue of the right side of (12) is quite analogous
to the rational case. The domain $F_0$ is divided into the two parts N(X) < 1
and N(X) > 1 and for the first part of $F_0$ the theta formula

$\sum_{A} e^{-\pi\sigma(A'XA)} = D^{-n^2/2}\,N(X)^{-n/2} \sum_{B} e^{-\pi\sigma(B'X^{-1}B)}$

is used; here $\mathfrak{d}$ denotes the fundamental ideal of K, the matrix A runs over
all integer matrices of K and B over all matrices whose elements belong to
the ideal $\mathfrak{d}^{-1}$. In this manner it can be seen that


(15) $\rho = (n+1)\,D^{-n^2/2}\,v_1,$


where $v_1 q^{(n+1)/2}$ is the volume of F.
Hence, by (14) and (15), the volume of the fundamental domain F is


(16) $v = \frac{2^{l}}{n+1}\,D^{(n^2-1)/2}\,R\,q^{(n+1)/2} \prod_{k=2}^{n}\left\{\pi^{-k/2}\,\Gamma\!\left(\frac{k}{2}\right)\right\}^{l}\zeta_K(k),$


and this is the generalization of Minkowski’s formula (1) to the case of any
totally real algebraic number field.

3. If some of the conjugates of K are imaginary, let $2r_2$ be their number
and $r_1$ the number of the real conjugates. Then $r_1 + 2r_2 = l$. For any matrix A
with elements of K, the conjugates in the real fields will be denoted by
$A_1, \ldots, A_{r_1}$ and the conjugates in the imaginary fields by $A_{r_1+1}, \ldots, A_l$;
moreover $A_k$ and $A_{k+r_2}$ ($k = r_1+1, \ldots, r_1+r_2$) shall be conjugate complex.
Put $r_1 + r_2 = p$. Instead of the l positive definite symmetric matrices $X_1, \ldots, X_l$
of the totally real case, $r_1$ positive definite symmetric matrices $X_1, \ldots, X_{r_1}$
and $r_2$ positive definite Hermitian matrices $X_{r_1+1}, \ldots, X_p$ must be considered.
The elements of $X_1, \ldots, X_p$ define a space $Q_0$ of $n(n+1)r_1/2 + n^2r_2$ real dimensions.
Let Q be the part of $Q_0$, where $|X_1 \cdots X_{r_1} X_{r_1+1}^2 \cdots X_p^2| < q$. If $\bar{C}$ denotes
the conjugate complex to C, the transformation $\bar{C}_k'X_kC_k$ (k = 1, ..., p)
leaves Q invariant for any unimodular matrix C of the field K. The existence
of a fundamental domain F on Q for the group of these transformations, which
is bounded by a finite number of analytic surfaces, is known for the case of
an imaginary quadratic field K by the investigations of Picard and Bianchi.
The proof can be extended to the case of any K.

For the calculation of the volume v of F, an analogue of (2) for the space
H of the positive definite Hermitian matrices X must be considered. If $y_{\kappa\lambda}$
and $z_{\kappa\lambda}$ denote real and imaginary parts of the element $x_{\kappa\lambda}$ of X, the volume
element dX of H will be defined by


d¥ = = JT] dya den. 


A=1 1S«Sd\Sn 1Sn<ASn 


Then the analogue of (2) is the formula 


n—l 
II + k) - f | 
H 


k=0 
If (k=p+1, - - - , 1) denotes the conjugate complex ¥,_,, to Xx_,,, the gen- 
eralization of (11) is 


k=0 k=0 


Qo | | | 
Since the equations €/ ¥,€,=%X, (k=1, ---, p) only hold identically in X, 
for a unimodular € of K, if €=w€, where € is the unit-matrix and w a root 
of unity, the matrices C/ ¥,€, (k=1, - - - , p) completely fill out Qo exactly 
w times, if %1,---, ¥» run over the fundamental domain Fy and € over all 
unimodular matrices; here w denotes the number of roots of unity in K. 
Hence corresponding to (12) and (13) 


n—l 


h m0 


&x(s)ix(s + 1) +++ &(s +m — 1) 
x 
dX; dX» 
frond, 


and by calculating the residues at s=1 on both sides, 


2 


n+1 2 


(17) 
F 


where D is the absolute value of the discriminant of K and R the regulator.
Therefore (17) gives the volume of the fundamental domain if the volume
element is defined by $|X_{r_1+1} \cdots X_p|\,dX_1 \cdots dX_p$, which is invariant under
unimodular transformation.

4. The special case n = 2 is closely connected with the theory of the modular
group in any totally real algebraic field K. Let $\tau_1, \ldots, \tau_l$ be a set of variables
in the upper half-plane. The modular group in K consists of all the
substitutions


$\tau_k' = \frac{\alpha_k\tau_k + \beta_k}{\gamma_k\tau_k + \delta_k} \qquad (k = 1, \ldots, l)$


for which α, β, γ, δ are integers of K and αδ − βγ is a totally positive unit. If ε
is any unit of K, then εα, εβ, εγ, εδ define of course the same modular substitution
as α, β, γ, δ.

Blumenthal has proved the existence of a fundamental domain $G_0$ of the
modular group in the space of the upper half-planes of the l complex variables
$\tau_k = t_k + iu_k$ (k = 1, ..., l). The domain is bounded by a finite number of
algebraic surfaces. Since the special modular substitutions with αδ − βγ = 1
form a subgroup of finite index m, they possess also a fundamental domain G.
The non-euclidean volume


$V = \int_G \frac{dt_1\,du_1}{u_1^2} \cdots \frac{dt_l\,du_l}{u_l^2}$


of G, and hence also the corresponding volume $m^{-1}V$ of $G_0$, can be found in
the following manner.
Formula (16) gives the volume 


as) 
PF 


for the space of the reduced systems of positive definite symmetric matrices 
Xi, X, of m rows with - - - ¥:| <q. Let m have the value 2 and con- 
sider instead of the group of all unimodular matrices (4) only the subgroup 
for which ad —Sy =e’ is a square of a unit of K. Since the index is 2', the 
corresponding volume in the space of X,, - - - , X, is 2'v. By the substitutions 


**) Ye + (yi? — 
Xx = 


= = — ye 
Zk 


Ye 3k 


the equation (18) is transformed into 


dt\duy, dt,du; 


uy up 


where the variables 4, - - - , w, run over G and &, - - - , £: over a fundamental 
region in £, - - - £;<q with respect to the group £’ = e*é formed by the fourth 
powers of all units of K. Hence 


2" = 422-1 
and by (16) 
(19) $V = 2\pi^{-l} D^{3/2}\,\zeta_K(2).$


Since it can be shown that $\pi^{-2l}D^{1/2}\zeta_K(2)$ is rational for totally real K, the
number $\pi^{-l}V$ is rational also, corresponding to a theorem of Dehn and Poincaré
on the volume of a non-euclidean polyhedron in an even number of
dimensions.
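
As a sanity check, one may specialize to the rational field, reading (19) as $V = 2\pi^{-l}D^{3/2}\zeta_K(2)$ (this reading of the damaged display is an assumption). For K the field of rational numbers we have l = 1, D = 1, and $\zeta_K = \zeta$, so that

```latex
\[
  V \;=\; 2\,\pi^{-1}\,\zeta(2) \;=\; \frac{2}{\pi}\cdot\frac{\pi^{2}}{6} \;=\; \frac{\pi}{3},
  \qquad
  \pi^{-1}V \;=\; \frac{1}{3}.
\]
```

This is the classical non-euclidean area of the fundamental domain of the ordinary modular group with respect to the measure $dt\,du/u^{2}$, and $\pi^{-1}V$ is indeed rational.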

5. The very simple result (19) can also be proved by another method,
which does not use the properties of the units and of the zeta functions of K.
Let $\eta_1, \ldots, \eta_l$ be any positive numbers, μ and ν two numbers of K, not both
0, and the ideal $(\mu, \nu) = \mathfrak{a}$. Consider the integral


nidt,duy nidt;du, 
(20) = 2 2 2 
w»=a YG (n1| Maiti + v1 | + Mitt + | + 


where yp, v run over all pairs with the greatest common divisor a. If a, 8, y, 6 
run over all integers with a6 —fPy =1, then wa+vy, uB@+vé run over all pairs 
which have the same greatest common divisor as the fixed numbers p, v 
Moreover, if in particular pwa+vry =u, u8+vd =v, then the integrand in (20) 
is invariant under the modular substitution (ar +8)/(yr+6). Hence J(a) is 
exactly twice the value of the integral with the same integrand and fixed y, v 
extended over a fundamental region for the subgroup I of the modular sub- 
stitutions with ywa+vy u8+vd=v. Choose now two numbers «, A of the 
ideal such that x» —Auw=1, and make the substitution 


A simple calculation shows that, for all elements (*4) of I’, the equation 


COGIC.) 


holds, where ¢ belongs to a~*; on the other hand, if ¢ belongs to 
a~?, then (%%) is an element of T. Let w:,---, w; be a basis of a~® and 
£=w,%,+ -- + +w,x,. Then a fundamental region of I is defined by the in- 
equalities 


Osu,<1i,u,>0 


Hence 


dx,du 
o (me + (ne + mx)? 


and, summing (20) over all integer ideals a, 


(m| wits + |? + 101)? (mi | + vr |? + my)? 


‘ kT +X 
(k =1,---,J). 
| 
where μ, ν run over all pairs of integers different from 0, 0. If $\eta_1, \ldots, \eta_l$ tend
to zero, the right side of (21) becomes


l +0 +00 d d 
(22) f f dtsdus), 


where yu, v, are variables of integration. Now, by the substitutions 


t u 
| | 


| 


rf dudv ff 
(| ur + + x)? 0 u(r+ un)? 


and therefore the expression (22) has the value 


dtyd 


Ue 


Together with (21), this completes the second proof of (19). 


INSTITUTE FOR ADVANCED STUDY, 
PRINCETON, N. J. 


= r'/2 sin 
= ru-?, 


MEAN VALUES OF ARITHMETICAL FUNCTIONS
IN NUMBER FIELDS*


BY
CARL LUDWIG SIEGEL


Let f(ξ) be a function which is defined for all integers ξ of a totally real
algebraic number field K of degree n. Associate with ξ the point of
n-dimensional space whose coordinates are the n conjugates
of ξ. For any region G of that space form the sum
(1) $F = \sum_{\xi \text{ in } G} f(\xi),$

in which ξ runs over all integers of K for which the associated
point lies in G. In some investigations in the theory of numbers it is
necessary to find an asymptotic approximation of F when G becomes infinite
in a certain manner. The formula derived below may be of use here; it is
a kind of generalization of the well known formula for the
sum of the coefficients of a Dirichlet series. Its application
is then illustrated by the example of the divisor problem.

For simplicity of exposition only the special case of the real quadratic
field will be treated. The extension to the case of a
totally real field of arbitrary degree presents no conceptual difficulties.


1. THE SUMMATION FORMULA


In what follows only those arithmetical functions f(ξ) will be considered
which possess the following two properties: there shall exist in K a
unit ε ≠ ±1 such that f(εξ) = f(ξ); furthermore, if ξ′ denotes the conjugate
of ξ, then for a suitable positive constant c and ξ ≠ 0 the
function $f(\xi)\,|\xi\xi'|^{-c}$ shall be bounded. Evidently one may assume ε > 1, ε′ > 0.
Two numbers ξ, η of K shall be called associated if $\xi\eta^{-1}$ is a power
of ε. Now let ξ run over a complete system of non-associated totally
positive integers and form the Dirichlet series


(2) be(s) = Do’ (k=0,+1,+2,---). 


They are absolutely convergent in the half-plane σ > c + 1.
First let the region G be taken to be the rectangle 0 < ξx < 1, 0 < ξ′x′ < 1,
where x and x′ are positive numbers, and for any r > 1 consider the
sum (1) with $f(\xi)(1 - \xi x)^{r-1}(1 - \xi' x')^{r-1}$ in place of f(ξ). For the
sum


* Presented to the Society, October 26, 1935; received by the editors March 25, 1935.


g(x, x’) = — Ex) — 
0<t’ 


there then holds
THEOREM 1. We have


1 bind x \ Tik/ loge 
2mi log € o—wi x 


s— ) B s+ 
log log 


if σ > c + 1 and the symbol B denotes Euler's beta function.


From the known properties of the beta function it follows that the integral
in (3), as a function of k, has the order of magnitude $|k|^{-r}$. The
infinite series in (3) is therefore absolutely convergent. Putting


(3) 


x = ue’, x = ue’, 


the right side of (3) becomes a trigonometric series with respect to
the variable v with the period 1. Hence it only remains to verify
that the Fourier coefficient


1 
a,(u) g(ue’, ue”) 
0 


agrees with the coefficient of $e^{2\pi i k v}$ on the right side of (3).
If the prime on the summation sign indicates that the sum extends only over
non-associated ξ, then


wik/loge 
aw =f (=) D’ — — 


For σ > c + 1 we therefore have


f u**—la,(u)du 
0 


1 '\ rik/loge 
= J (xx’)s-! (=) tx) "(1 


2z’<l 


2 log 


z 
~ 9 log 
1 wt 
B(r, s- ) s+ ox(s). 
2 loge log log 
From this there follows, by Mellin's inversion formula, the relation


1 ores rik 
a,(u) = — f u-~**B ) B s+ ox(s)ds 
2mi log log log 


and with it the assertion.
For the special case r = 2 we get


a’) = ff 
0 0 


0<’<v’ 
Putting, for abbreviation,
nik 
(4) = v’), St = Sk 
O<t<v log log € 


we have for positive y, y′
THEOREM 2.


J F(x + v, x’ + v’)dvdv’ 


(x + y) — (x’ + y +1 
= f ox(s)ds 
+ 1) si (sé + 1) 


In analogy with the formula for the sum of the coefficients of a Dirichlet
series, (5) is useful when the functions $\phi_k(s)$ can be continued beyond the
half-plane of absolute convergence of the series (2). For then the residue
theorem yields in many cases an asymptotic expression for the
right side of (5).

Starting from (5), one can arrive in various ways at an approximate
value for the sum F defined in (1). If, in particular, f(ξ) is non-negative,
then by (4)


F(x, S F(x +, x + 0’) S F(x + y, + y’) 


for 0 < v < y, 0 < v′ < y′, and therefore


ime 


1 y’ 
(6) F(x,x’)s —f f F(x + v, x + v')dvdv’ S F(x + y, x’ + y’). 
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Eine asymptotische Abschitzung des mittleren Gliedes dieser Ungleichung 
fiihrt also auch zu einer Annaherung von F(x, x’). Nun ist F(x, x’) der 
Specialfall der allgemeineren Summe F, in welchem das Gebiet G das Rech- 
teck 0<i<x, 0<é&’<x’ bedeutet. Um zu einer Aussage iiber F selbst zu 
kommen, hat man noch G durch Addition und Subtraction jener speciellen 
Rechtecke anzuniéhern. 
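The sandwich inequality (6) can be tried out numerically. The following is only a minimal sketch under simplifying assumptions: the lattice of rational integer pairs replaces the integers of the field $K$, the weight `f` is a hypothetical non-negative function chosen for illustration, and the double integral is approximated by a midpoint rule.

```python
import math

def f(m, n):
    # Hypothetical non-negative weight (an assumption for illustration only).
    return 1.0 / (m + n)

def F(x, xp):
    """Sum of f over lattice points 0 < m <= x, 0 < n <= xp."""
    return sum(f(m, n)
               for m in range(1, math.floor(x) + 1)
               for n in range(1, math.floor(xp) + 1))

def mean_F(x, xp, y, yp, steps=40):
    """Midpoint-rule approximation of (1/(y*yp)) * double integral of F(x+v, xp+vp)."""
    total = 0.0
    for i in range(steps):
        v = (i + 0.5) * y / steps
        for j in range(steps):
            vp = (j + 0.5) * yp / steps
            total += F(x + v, xp + vp)
    return total / steps ** 2

x, xp, y, yp = 6.5, 4.2, 2.0, 3.0
lower = F(x, xp)
middle = mean_F(x, xp, y, yp)
upper = F(x + y, xp + yp)
print(lower, middle, upper)
```

Because `f` is non-negative, `F` is monotone in both arguments, so every sampled value lies between `lower` and `upper`, and so does their average.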

Incidentally, one can also give an explicit formula for $F$.
Namely,


1 baad 
(7) F = >» f ox(s) (ff ds, 
2mi log € Yo G 


—ot 


provided $G$ is bounded by a differentiable curve in the first quadrant
which passes through no integral $\xi$. But since Theorem 2 is far more practical for the applications,
we omit the somewhat laborious proof of (7).


2. THE DIVISOR PROBLEM


Some examples of arithmetic functions whose mean values can be computed
with the help of Theorem 2 are: (1) the number of non-associated
totally positive divisors of $\xi$; (2) the number of ideal divisors of $\xi$;
(3) the number of decompositions of $\xi$ into two integral squares from $K$;
(4) a residue class character; (5) $e^{2\pi i S(\nu\xi)}$, where $S$ denotes the trace and $\nu$ is any
fractional number of $K$; (6) 1 or 0, according as $\xi$ is a prime number or
not.
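For orientation, the classical analogue of example (1) over the rational integers is Dirichlet's divisor problem: the sum of the divisor function $d(n)$ for $n\le x$ equals $x\log x + (2\gamma_0 - 1)x$ up to an error $O(\sqrt{x})$, where $\gamma_0$ is Euler's constant. The sketch below treats only this rational case, not the number-field sum of the text; the function names are illustrative assumptions.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant

def divisor_sum(x):
    """Sum of d(n) for 1 <= n <= x, using sum over k of floor(x / k)."""
    return sum(x // k for k in range(1, x + 1))

def main_term(x):
    """Dirichlet's main term x*log(x) + (2*gamma - 1)*x."""
    return x * math.log(x) + (2 * EULER_GAMMA - 1) * x

for x in (10, 1000, 100000):
    print(x, divisor_sum(x), round(main_term(x), 2))
```

The difference between the two printed columns stays small compared with $\sqrt{x}$, which is the behavior that formula (11) generalizes to rectangles in a real quadratic field.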

Only the first example will be treated here. Let $\varepsilon$ be the fundamental
unit of the totally positive units of $K$, and $d$ the discriminant
of $K$. If one sets


where $\xi$ runs over all non-associated totally positive integers, then
(2) becomes the equation


ox(s) = 


As Hecke has shown, $\zeta_k(s)$ is an entire function of $s$ for $k\neq 0$. Furthermore,
the function $\zeta_0(s) - (\log\varepsilon/d^{1/2})(s-1)^{-1}$ is also entire; let its value at $s=1$ be
$\gamma$, which incidentally can be computed, following Hecke and Herglotz, by means of
Kronecker's limit formula. Since the series (8)
converges absolutely for $\sigma>1$, $\sigma$ in (5) may in the present case
be any number $>1$. From the behavior of $\zeta_k(s)$ in the critical strip $0<\sigma<1$,
studied in detail by Hecke, it follows that the integration
over $s$ in (5) may also be carried out along an arbitrary line of the critical strip,
provided the residue of the function


s(s + 1) s(s + 1) 
at $s=1$ is taken into account. Because of


s(s + 1) s(s + 1) 


y y’ z+v 
= f f ( f f (uy'dudw) dvdv’ 
0 0 0 0 


the residue has the value
7] lo 
(== log (ux 7) duds’) dvdv’ . 


Since, moreover, the function
(a—1)/2 (o—1)/2 
€ 


is bounded uniformly in $k$ for $s=\sigma+ti$, one can estimate the integrals on
the right-hand side of (5) and obtains


F(x + 0, x’ + v')dodv’ 


+ O((% + + 


(1 


log 


for every $\sigma>0$. The most favorable choice of $y$, $y'$ is
£. 


from (6) and (9) it then follows that


(10) F(x, x’) 


for every $\delta>0$.


= 


log (uu’) + dudu’ + 


Now let $R$ be any rectangle $a<x<b$, $a'<y<b'$ in the first quadrant
and $\tau(\xi)$ the number of non-associated totally positive divisors of $\xi$. By
(4) and (10) we then have


(11) 7(é) = Sf. : log (uu’) + =) dudu’ + O((bb’)2/3+8), 


This formula can easily be carried over to more general regions. Let
$G$ lie entirely in the first quadrant and have area $J$. If $G$ then contains the
point $\xi=1$ and the perimeter of $G$ is at most of the order of magnitude $J^{\alpha}$,
where $\alpha$ is a fixed positive number $<3/5$, then by (11)


lo 2y 
r(é) = log (uu’) + =) dudu’ + o(J). 


INSTITUTE FOR ADVANCED STupy, 
PRINCETON, N. J. 


ON THE ALGEBRAIC INTEGRALS OF THE
RESTRICTED THREE-BODY PROBLEM*†


BY
CARL LUDWIG SIEGEL


Among the few general results obtained in the study of the
three-body problem during recent decades,
the theorem of Bruns is of interest despite its negative character. It
states that the 10 known integrals, namely the 6 center-of-mass
integrals, the 3 area integrals, and the energy integral, exhaust all
algebraic integrals of the problem; or, expressed differently,
that the field of those algebraic functions of the time, of the 9
rectangular coordinates of the three mass points, and of the 9 derivatives
of these coordinates with respect to the time which are constant on every orbit
depends on exactly 10 variables.

In what follows an analogous theorem will be proved for the restricted
three-body problem, that is, for the limiting case of the general three-body
problem in which the motion takes place in a plane, two bodies
describe a circular orbit about their common center of mass, and the
third body has mass 0. If one chooses in the plane of the three bodies
a rectangular Cartesian coordinate system whose origin is the
center of mass and which rotates about this center of mass in such a way that the
first two bodies are at rest relative to the coordinate system, then
the differential equations for the motion of the third body read


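(1) $\ddot x - 2\dot y = V_x, \qquad \ddot y + 2\dot x = V_y$;
here
$V = \mu_1\left(\tfrac12 r^2 + r^{-1}\right) + \mu\left(\tfrac12 r_1^2 + r_1^{-1}\right)$,
$r^2 = (x-\mu)^2 + y^2, \qquad r_1^2 = (x+\mu_1)^2 + y^2, \qquad \mu+\mu_1 = 1$.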

The center-of-mass integrals and the area integrals of the general
three-body problem drop out in the restricted problem. The place of the
energy integral is taken by the Jacobi integral


(2) $\dot x^2 + \dot y^2 - 2V = \text{constant}$.
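The conservation law (2) can be checked numerically. The sketch below integrates $\ddot x = 2\dot y + V_x$, $\ddot y = -2\dot x + V_y$ with a classical fourth-order Runge-Kutta step, taking $V = \mu_1(\tfrac12 r^2 + r^{-1}) + \mu(\tfrac12 r_1^2 + r_1^{-1})$ with $r^2=(x-\mu)^2+y^2$, $r_1^2=(x+\mu_1)^2+y^2$. The mass parameter, the initial state, and the step size are assumptions chosen only for illustration.

```python
import math

MU = 0.3            # hypothetical mass parameter (assumption)
MU1 = 1.0 - MU      # so that MU + MU1 = 1

def potential(x, y):
    """V = mu1*(r^2/2 + 1/r) + mu*(r1^2/2 + 1/r1)."""
    r = math.hypot(x - MU, y)
    r1 = math.hypot(x + MU1, y)
    return MU1 * (0.5 * r * r + 1.0 / r) + MU * (0.5 * r1 * r1 + 1.0 / r1)

def gradient(x, y):
    """Partial derivatives (V_x, V_y)."""
    r = math.hypot(x - MU, y)
    r1 = math.hypot(x + MU1, y)
    k = MU1 * (1.0 - r ** -3)
    k1 = MU * (1.0 - r1 ** -3)
    return k * (x - MU) + k1 * (x + MU1), (k + k1) * y

def rhs(state):
    """First-order form of the equations of motion in the rotating frame."""
    x, y, xd, yd = state
    vx, vy = gradient(x, y)
    return (xd, yd, 2.0 * yd + vx, -2.0 * xd + vy)

def rk4_step(state, h):
    def shifted(s, k, c):
        return tuple(si + c * ki for si, ki in zip(s, k))
    k1 = rhs(state)
    k2 = rhs(shifted(state, k1, h / 2))
    k3 = rhs(shifted(state, k2, h / 2))
    k4 = rhs(shifted(state, k3, h))
    return tuple(s + h / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def jacobi(state):
    """Left-hand side of the Jacobi integral (2)."""
    x, y, xd, yd = state
    return xd * xd + yd * yd - 2.0 * potential(x, y)

state = (0.2, 1.0, 0.1, 0.0)   # hypothetical initial data
j_start = jacobi(state)
for _ in range(1000):
    state = rk4_step(state, 1e-3)
drift = abs(jacobi(state) - j_start)
print(drift)
```

The printed drift reflects only the discretization error of the integrator; along an exact solution the quantity is constant.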
An algebraic integral of (1) is an algebraic function
$f(x, y, \dot x, \dot y, t)$ of the position coordinates $x$, $y$, the velocity coordinates $\dot x$, $\dot y$,
and the time $t$ which is constant by virtue of (1); that is, the
expression


$f_t + \dot x f_x + \dot y f_y + (2\dot y + V_x) f_{\dot x} + (-2\dot x + V_y) f_{\dot y}$


must vanish identically in its 5 arguments. If $\varphi(z)$ denotes an
algebraic function of one variable $z$, then $\varphi(\dot x^2+\dot y^2-2V)$ is by (2) an
algebraic integral of (1). It will be proved that every algebraic
integral of (1) has this form.


* Presented to the Society, October 26, 1935; received by the editors March 19, 1935.
† Dedicated to Paul Epstein.

This theorem is not contained as a special case in the result of Bruns,
for the restriction of the degrees of freedom could well make possible the
appearance of new algebraic integrals. On the contrary, the specialization
has the consequence that Bruns's proof cannot be carried over
without further ado; in particular, some difficulties arising from the
motion of the coordinate system must also be overcome.
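The defining condition of an integral, namely that $\dot x f_x + \dot y f_y + (2\dot y + V_x) f_{\dot x} + (-2\dot x + V_y) f_{\dot y}$ vanishes for a time-independent $f$, can be tested with finite differences. In this sketch the choice $\varphi(z)=z^2+3z$, the mass parameter, and the test point are illustrative assumptions; the second function is a deliberate non-example.

```python
import math

MU, MU1 = 0.3, 0.7   # hypothetical mass parameters with MU + MU1 = 1

def potential(x, y):
    r = math.hypot(x - MU, y)
    r1 = math.hypot(x + MU1, y)
    return MU1 * (0.5 * r * r + 1.0 / r) + MU * (0.5 * r1 * r1 + 1.0 / r1)

def jacobi(x, y, xd, yd):
    return xd * xd + yd * yd - 2.0 * potential(x, y)

def partial(fn, args, i, h=1e-5):
    """Central-difference approximation of the i-th partial derivative."""
    up, dn = list(args), list(args)
    up[i] += h
    dn[i] -= h
    return (fn(*up) - fn(*dn)) / (2 * h)

def time_derivative(g, x, y, xd, yd):
    """d/dt of g(x, y, xd, yd) along solutions of the equations of motion."""
    args = (x, y, xd, yd)
    vx = partial(potential, (x, y), 0)
    vy = partial(potential, (x, y), 1)
    gx, gy, gxd, gyd = (partial(g, args, i) for i in range(4))
    return xd * gx + yd * gy + (2 * yd + vx) * gxd + (-2 * xd + vy) * gyd

def phi_of_jacobi(x, y, xd, yd):
    z = jacobi(x, y, xd, yd)
    return z * z + 3 * z      # hypothetical phi(z) = z^2 + 3z

def not_an_integral(x, y, xd, yd):
    return xd                 # the velocity component alone is not conserved

point = (0.2, 1.0, 0.1, -0.4)
residual_integral = time_derivative(phi_of_jacobi, *point)
residual_other = time_derivative(not_an_integral, *point)
print(residual_integral, residual_other)
```

The first residual is zero up to finite-difference error, while the second equals $2\dot y + V_x$ at the test point and is visibly non-zero.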

1. Let $\Omega$ be the field of all complex numbers and $\Omega(\dot x, \dot y, x, y, r, r_1, t)$ the
field arising from $\Omega$ by adjunction of $\dot x$, $\dot y$, $x$, $y$, $r$, $r_1$, $t$. If $f$ denotes
an algebraic integral of the restricted three-body problem, let


(3) $f^n + a_1 f^{n-1} + \cdots + a_n = 0$


be the equation for $f$ irreducible in $\Omega(\dot x, \dot y, x, y, r, r_1, t)$, in which therefore $a_1, \dots, a_n$
are rational functions of $\dot x$, $\dot y$, $x$, $y$, $r$, $r_1$, $t$. By total differentiation
with respect to $t$ it follows that the equation


(4) $\dot a_1 f^{n-1} + \cdots + \dot a_n = 0$


holds identically in $\dot x$, $\dot y$, $x$, $y$, $t$ when the second derivatives
$\ddot x$, $\ddot y$ in $\dot a_k$ ($k=1, \dots, n$) are eliminated by means of (1). But since $\dot a_k$ then again belongs to the field
$\Omega(\dot x, \dot y, x, y, r, r_1, t)$, (4) would furnish an algebraic equation
of degree $n-1$ for $f$ in this field, unless all coefficients are 0.
Since (3) is the equation of lowest degree for $f$, all
$\dot a_k$ therefore vanish identically in $\dot x$, $\dot y$, $x$, $y$, $t$. Consequently every $a_k$ is itself an algebraic integral.
To prove the asserted theorem one therefore has only to show that every
integral belonging to the field $\Omega(\dot x, \dot y, x, y, r, r_1, t)$ is a rational function
of the single variable $\dot x^2+\dot y^2-2V$.

2. Let the integral $f$ be a rational function of $\dot x$, $\dot y$, $x$, $y$, $r$, $r_1$, $t$. Set


$f = c\,\dfrac{g(t)}{h(t)}$,


where $c$ does not depend on $t$ and $g(t)$, $h(t)$ are two polynomials in $t$ with leading
coefficient 1, relatively prime with respect to the variable $t$, whose coefficients are
rational functions of $\dot x$, $\dot y$, $x$, $y$, $r$, $r_1$. The equation

(5) $\dfrac{\dot c}{c} + \dfrac{\dot g}{g} - \dfrac{\dot h}{h} = 0$

holds identically in $\dot x$, $\dot y$, $x$, $y$, $t$ when $\ddot x$, $\ddot y$ are eliminated by (1). Since $t$ does not
occur explicitly in (1), $\dot c : c$ is free of $t$, and $\dot g : g$, $\dot h : h$ are proper
fractional functions of $t$ with relatively prime denominators. Hence each
of the 3 fractions in (5) vanishes, and $c$, $g$, $h$ are individually integrals. One therefore has


only to determine the integrals that are polynomials in $t$.
3. Let the integral $f$ have the form


(6) $f = b_0 t^m + \cdots + b_m \qquad (m \ge 0)$,


where $b_0, \dots, b_m$ are rational functions of $\dot x$, $\dot y$, $x$, $y$, $r$, $r_1$, of which $b_0$
does not vanish identically. It is to be proved that $m=0$ and
consequently $t$ does not occur explicitly in $f$. This follows most simply from
Poincaré's recurrence theorem.

Namely, choose for the constant of the Jacobi integral (2)
a value $\gamma$ such that the Hill curve


$2V + \gamma = 0$


in the $x, y$-plane consists of 3 ovals, of which 2 each contain one of the points $(\mu, 0)$
and $(-\mu_1, 0)$, while the third encloses the other two.
Furthermore let $\gamma$ be so determined that not at all points of the manifold


(7) $\dot x^2 + \dot y^2 - 2V = \gamma$


does one of the functions $b_0, b_1, \dots, b_m$ vanish. One can then find a
point $x=x_0$, $y=y_0$ in one of the first two ovals and in addition a pair
$\dot x=\dot x_0$, $\dot y=\dot y_0$ satisfying equation (7) such that the functions
$b_0, \dots, b_m$ are all different from 0 for $x_0$, $y_0$, $\dot x_0$, $\dot y_0$. On the other hand,
by the recurrence theorem there exist, for every arbitrarily small neighborhood
of $x_0$, $y_0$, $\dot x_0$, $\dot y_0$, orbits which enter this neighborhood at least twice,
and indeed at instants whose difference can be chosen above an arbitrarily
large bound. From (6), however, it would follow that this
difference is bounded if $m>0$.

It is perhaps methodically unsatisfying to invoke the analytic-arithmetic
recurrence theorem to prove the purely algebraic fact
that $f$ is independent of $t$. One can also show this algebraically, but,
as it seems, only by more complicated arguments.


4. On the basis of the result of the last section one has only
to concern oneself with finding those integrals which are rational
functions of $\dot x$, $\dot y$, $x$, $y$, $r$, $r_1$ alone. Let $f=g:h$, where $g$ and $h$
are two relatively prime polynomials in the two variables $\dot x$, $\dot y$ whose coefficients
lie in $\Omega(x, y, r, r_1)$. The equation


(8) $\dot g\,h = \dot h\,g$


holds identically in $\dot x$, $\dot y$, $x$, $y$ when $\ddot x$, $\ddot y$ are removed from it by means of (1).
Now


(9) $\dot g = \dot x g_x + \dot y g_y + (2\dot y + V_x) g_{\dot x} + (-2\dot x + V_y) g_{\dot y}$
is a polynomial in $\dot x$, $\dot y$ whose degree in these variables exceeds that of $g$ by at most 1,
and whose coefficients belong to $\Omega(x, y, r, r_1)$. On the other hand,


by (8), $g$ is a divisor of $\dot g$. From this follows, identically in $\dot x$, $\dot y$, $x$, $y$, the
equation


(10) $\dot g = g\,(u\dot x + v\dot y + w)$,


where $u$, $v$, $w$ lie in $\Omega(x, y, r, r_1)$.
Let, arranged in descending powers of $\dot x$,


$G = a\dot x^{m} + \cdots + b\dot y^{m}$


be the aggregate of the terms of $g$ which have highest dimension in $\dot x$, $\dot y$; and
let $ab$ be not identically 0. From (9) and (10) it then follows that


(11) $u = \dfrac{a_x}{a}, \qquad v = \dfrac{b_y}{b}$,

(12) $\dot x G_x + \dot y G_y = \left(\dfrac{a_x}{a}\,\dot x + \dfrac{b_y}{b}\,\dot y\right) G$.


The integral quantities of $\Omega(x, y, r, r_1)$ have the form $c_1 + c_2 r + c_3 r_1 + c_4 r r_1$,
where $c_1$, $c_2$, $c_3$, $c_4$ denote polynomials in $x$, $y$. Since the pair $g$, $h$ is determined only up to a
common factor from $\Omega(x, y, r, r_1)$, one may assume
that $a$ has one of the 4 forms $c_1$, $c_2 r$, $c_3 r_1$, $c_4 r r_1$, that furthermore the coefficients
$a, \dots, b$ of $G$ are all integral and have neither $r$ nor $r_1$ nor a
polynomial in $x$, $y$ as common divisor. But from (12) it now follows that
every polynomial in $x$ and $y$ irreducible in $\Omega(x, y)$ which divides $a$ divides at the same time
all coefficients of $G$, and the same holds for divisibility
by $r$ or $r_1$. Hence $a$ is a constant. By interchanging the roles of $x$
and $y$, one sees that $b$ is also constant.

By (8), (10) and (11) one now has only to determine those polynomials $g$ in the
two variables $\dot x$, $\dot y$ with coefficients from $\Omega(x, y, r, r_1)$ which
satisfy the differential equation
(13) $\dot g = wg$,
where $w$ lies in $\Omega(x, y, r, r_1)$.
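5. It will now be shown that the function $w$ in (13) is a constant.
Let
(14) $g = g^{(m)} + g^{(m-1)} + \cdots + g^{(0)}$,
where $g^{(k)}$ for $k=0, \dots, m$ denotes a homogeneous polynomial of dimension $k$ in $\dot x$, $\dot y$


whose coefficients lie in $\Omega(x, y, r, r_1)$; and let $g^{(m)}$ be not
identically 0. From (9), (13), (14) follow the equations


(15) $\dot x g_x^{(m)} + \dot y g_y^{(m)} = 0$,
(16) $\dot x g_x^{(m-1)} + \dot y g_y^{(m-1)} + 2\dot y g_{\dot x}^{(m)} - 2\dot x g_{\dot y}^{(m)} = w g^{(m)}$,
(17) $\dot x g_x^{(k-1)} + \dot y g_y^{(k-1)} + 2\dot y g_{\dot x}^{(k)} - 2\dot x g_{\dot y}^{(k)} + V_x g_{\dot x}^{(k+1)} + V_y g_{\dot y}^{(k+1)} = w g^{(k)}$
($k=1, \dots, m-1$),
(18) $V_x g_{\dot x}^{(1)} + V_y g_{\dot y}^{(1)} = w g^{(0)}$.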


Of these, (15) states that $g^{(m)}$ is a function of the three variables $\dot x$, $\dot y$, $x\dot y - y\dot x$
alone. But since on the other hand $g^{(m)}$ is a polynomial in the variables $\dot x$, $\dot y$ with
coefficients from $\Omega(x, y, r, r_1)$, $g^{(m)}$ is a polynomial in $\dot x$, $\dot y$, $x\dot y - y\dot x$.
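Equation (15) says that $g^{(m)}$ is unchanged when $(x, y)$ is displaced along the direction $(\dot x, \dot y)$. The combination $\alpha = x\dot y - y\dot x$ has exactly this invariance, while $\beta = x\dot x + y\dot y$ grows linearly. The sketch below checks both facts in exact rational arithmetic, together with the identity $(\dot x^2+\dot y^2)\,r_1^2 = (\alpha+\mu_1\dot y)^2 + (\beta+\mu_1\dot x)^2$ that follows from the definition $r_1^2=(x+\mu_1)^2+y^2$. The sample polynomial `P` and all numerical values are assumptions for illustration.

```python
from fractions import Fraction as Fr

def alpha(x, y, xd, yd):
    return x * yd - y * xd

def beta(x, y, xd, yd):
    return x * xd + y * yd

def P(xd, yd, a):
    # Hypothetical sample polynomial in xd, yd and alpha (illustration only).
    return xd ** 2 * a + 3 * yd * a ** 2 - 5 * xd * yd

x, y, xd, yd = Fr(2, 3), Fr(-1, 4), Fr(5, 7), Fr(3, 2)
mu1 = Fr(7, 10)  # hypothetical value of mu_1

alpha_changes, beta_changes, p_changes = [], [], []
for t in (Fr(1, 10), Fr(-2), Fr(7, 3)):
    xs, ys = x + t * xd, y + t * yd          # displacement along (xd, yd)
    alpha_changes.append(alpha(xs, ys, xd, yd) - alpha(x, y, xd, yd))
    beta_changes.append(beta(xs, ys, xd, yd) - beta(x, y, xd, yd) - t * (xd ** 2 + yd ** 2))
    p_changes.append(P(xd, yd, alpha(xs, ys, xd, yd)) - P(xd, yd, alpha(x, y, xd, yd)))

a, b = alpha(x, y, xd, yd), beta(x, y, xd, yd)
lhs = (xd ** 2 + yd ** 2) * ((x + mu1) ** 2 + y ** 2)
rhs = (a + mu1 * yd) ** 2 + (b + mu1 * xd) ** 2
print(alpha_changes, beta_changes, lhs == rhs)
```

Since $\beta$ changes by $t(\dot x^2+\dot y^2)$ under the displacement, the operator $\dot x\,\partial_x + \dot y\,\partial_y$ becomes $(\dot x^2+\dot y^2)\,\partial_\beta$ in the variables $\alpha$, $\beta$.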

Let $x=\xi$ be a finite pole of order $h\ge 1$ of $w$ as a function of $x$.
First let $\xi$ be different from $\mu\pm yi$ and $-\mu_1\pm yi$. Since $g^{(m)}$ is a primitive
polynomial in $x$, $y$, $wg^{(m)}$ also becomes infinite of exactly order $h$ at $x=\xi$.
If $g^{(m-1)} = c(x-\xi)^{s} + \cdots$ is the expansion of $g^{(m-1)}$ in ascending
powers of $x-\xi$, then


$g_x^{(m-1)} = cs(x-\xi)^{s-1} + \cdots$,


and by (16) one obtains $s-1=-h$, $s\neq 0$. Consequently $h>1$, and $g^{(m-1)}$ has
at $x=\xi$ a pole of order $h-1$. If it has already been proved that $g^{(m-l)}$ has at
$x=\xi$ a pole of order $l(h-1)$, it follows in the same way from (17)
that $g^{(m-l-1)}$ has at $x=\xi$ a pole of order $(l+1)(h-1)$. But then
in (18) the right-hand side would become infinite of order $m(h-1)$ at $x=\xi$
and the left-hand side at most of order $(m-1)(h-1)$. Hence
$w$ as a function of $x$ has no pole different from $\infty$, $\mu\pm yi$, $-\mu_1\pm yi$.

If $w$ had at $x=\infty$ a pole of order $h$ and $g^{(m)}$ is of degree $p$ with respect to $x$,
one would obtain quite analogously that $g^{(m-l)}$ has at $x=\infty$ a pole
of order $l(h+1)+p$, in contradiction to (18).
Furthermore, if $w$ became infinite like $r^{-h}$ at $x=\mu\pm yi$, it would follow in the case $h>1$,
since $V_x$ and $V_y$ become infinite only like $r^{-3}$, that $g^{(m-l)}$ becomes infinite like
$r^{-l(h-2)}$ at $x=\mu\pm yi$ and that $h>2$; again contrary to (18). Likewise it follows
that $w$ cannot become infinite more strongly than $r_1^{-1}$ at $x=-\mu_1\pm yi$.

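The same investigation can be carried out for $w$ as a function of $y$.
Hence $w$ must have the form


$w = \dfrac{c_1}{r r_1} + \dfrac{c_2}{r} + \dfrac{c_3}{r_1} + c_4$,


where $c_1$ is at most quadratic in $x$, $y$, further $c_2$, $c_3$ are at most linear, and $c_4$
is constant. By (16), $g^{(m-1)}$ is then an integral function in $\Omega(x, y, r, r_1)$, hence


$g^{(m-1)} = b_1 r r_1 + b_2 r + b_3 r_1 + b_4$,


with polynomials $b_1$, $b_2$, $b_3$, $b_4$ in $x$, $y$, and furthermore


(19) $\dot x (b_1 r r_1)_x + \dot y (b_1 r r_1)_y = \dfrac{c_1}{r r_1}\, g^{(m)}$,

(20) $\dot x (b_2 r)_x + \dot y (b_2 r)_y = \dfrac{c_2}{r}\, g^{(m)}$,

(21) $\dot x (b_3 r_1)_x + \dot y (b_3 r_1)_y = \dfrac{c_3}{r_1}\, g^{(m)}$.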


If instead of $x$, $y$ one introduces the variables
$\alpha = x\dot y - y\dot x, \qquad \beta = x\dot x + y\dot y$,
then $g^{(m)}$ does not depend on $\beta$, and by virtue of (19)


(22) $(\dot x^2+\dot y^2)\,(b_1 r r_1)_\beta = \dfrac{c_1}{r r_1}\, g^{(m)}$.
If now $b_1$ were not identically 0, then $b_1 r r_1$ as a function of $\beta$ would become infinite at least
like $\beta^2$, hence $(b_1 r r_1)_\beta$ at least like $\beta$, while the right-hand side of
(22) remains bounded. Hence $b_1=0$, $c_1=0$. Likewise it follows from (20) and (21)
first that $b_2$ and $b_3$ are constant, and then


$b_2(\beta - \mu\dot x) = c_2 g^{(m)}, \qquad b_3(\beta + \mu_1\dot x) = c_3 g^{(m)}$,


hence, since $g^{(m)}$ is free of $\beta$, $c_2=0$, $c_3=0$. This proves that $w$ is a
constant.

6. With the aid of the recurrence theorem it can now easily be shown that the
constant $w=0$. For from (13) follows by integration


(23) $g = c\,e^{wt}$,


where the constant $c$ depends on the orbit. If one considers the
neighborhood of a system $x=x_0$, $y=y_0$, $\dot x=\dot x_0$, $\dot y=\dot y_0$ in which $g$ has a finite
value different from 0, then there are orbits which after an arbitrarily
large time interval enter this neighborhood once more. If now $w\neq 0$,
the right-hand side of (23) would have for $t\to\infty$ the limit 0 or $\infty$.
Consequently $w=0$.

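7. To prove the theorem stated at the outset one has only
to show further that every integral which is a polynomial in $\dot x$, $\dot y$ with coefficients
from $\Omega(x, y, r, r_1)$ reduces to a polynomial in the single variable $\dot x^2+\dot y^2-2V$.

If $g = g^{(m)} + \cdots + g^{(0)}$ is the decomposition of the integral $g$ into homogeneous
constituents of dimensions $m, \dots, 0$ in $\dot x$, $\dot y$, then equations (15),
(16), (17), (18) hold with $w=0$, thus
(24) $\dot x g_x^{(m)} + \dot y g_y^{(m)} = 0$,

(25) $\dot x g_x^{(m-1)} + \dot y g_y^{(m-1)} + 2\dot y g_{\dot x}^{(m)} - 2\dot x g_{\dot y}^{(m)} = 0$,
(26) $\dot x g_x^{(k-1)} + \dot y g_y^{(k-1)} + 2\dot y g_{\dot x}^{(k)} - 2\dot x g_{\dot y}^{(k)} + V_x g_{\dot x}^{(k+1)} + V_y g_{\dot y}^{(k+1)} = 0$
($k=1, \dots, m-1$),


(27) $V_x g_{\dot x}^{(1)} + V_y g_{\dot y}^{(1)} = 0$.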


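Again introduce, instead of $x$, $y$, the variables


$\alpha = x\dot y - y\dot x, \qquad \beta = x\dot x + y\dot y$.
Then (24) states that
$g^{(m)} = P(\dot x, \dot y, \alpha)$


is a polynomial in the 3 variables $\dot x$, $\dot y$, $\alpha$ alone. In place of (25) there appears


$(\dot x^2+\dot y^2)\, g_\beta^{(m-1)} + 2\left(\dot y P_{\dot x} - \dot x P_{\dot y} - \beta P_\alpha\right) = 0$,
and consequently
(28) $(\dot x^2+\dot y^2)\, g^{(m-1)} = \beta^2 P_\alpha - 2\beta\left(\dot y P_{\dot x} - \dot x P_{\dot y}\right) + g_1$,
where $g_1$ does not depend on $\beta$. Considered as a function of $\alpha$, $\beta$, the expression
$(\dot x^2+\dot y^2)\, g^{(m-1)}$ is therefore a quadratic polynomial in $\beta$.
By (26) we obtain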


2 2 k—1 k k k k 
+5 ff + Bex — ags ) 


(29) (k+1) (k+1) (k+1) 
—Vilgs ) 


(k+1) (k+1) 


for $k=1, \dots, m-1$. By (27) this also holds for $k=0$ if $g^{(-1)}=0$ is
defined; and with $g^{(-2)}=0$ the equation is trivially still correct for $k=-1$.
For what follows it suffices to examine (29) for $k=m-1$ and $k=m-2$.
If one notes that


$(\dot x^2+\dot y^2)\, r_1^2 = (\alpha + \mu_1\dot y)^2 + (\beta + \mu_1\dot x)^2$,
then from (29) for $k=m-1$ there results by integration by parts the
equation
yPs— (x — w)(Py t+ 
a— py r 
a+ my 


= 


(30) 


+ + 82, 

where $g_2$, as a function of $\dot x$, $\dot y$, $\alpha$, $\beta$, is a biquadratic polynomial in $\beta$. Finally,
(29) for $k=m-2$, with the use of (28) and (30), yields by a rather long
but entirely elementary computation the following relation for $g^{(m-3)}$:


r 


(31) 


+ — + “a8 + £3, 
1 


where $g_3$ lies in $\Omega(\dot x, \dot y, x, y, r, r_1)$.
8. If one sets


(32) — — = A, — + = B, 


then by (31) the function


d, d 
(33) 
r 


lies in $\Omega(\dot x, \dot y, x, y, r, r_1)$. By letting $\beta$ make a circuit around the point
Bo = wt + i(a — wy) — + my) 


in the course of which $r$ changes its sign, one sees that


d d 
on 
1 


also belongs to $\Omega(\dot x, \dot y, x, y, r, r_1)$. On the other hand, under a circuit around the
point at infinity the two integrals each increase by $2\pi i$, while the
functions (33) and (34) must remain unchanged. Hence
$A+B=0, \qquad -A+B=0$,
(35) $A=0, \qquad B=0$.
9. From (32) and (35) follow the two differential equations
$\dot y P_{\dot x} - \dot x P_{\dot y} = 0, \qquad P_\alpha = 0$.


By the second, the polynomial $P(\dot x, \dot y, \alpha)$ depends only on $\dot x$ and $\dot y$; by the
first, in fact only on $\dot x^2+\dot y^2$. As a homogeneous polynomial in $\dot x$, $\dot y$ therefore


$P = a(\dot x^2+\dot y^2)^k$


with constant $a$ and natural number $k$. Since now $a(\dot x^2+\dot y^2-2V)^k$ is likewise an
integral, the function


$g - a(\dot x^2+\dot y^2-2V)^k$


is an integral which is again a polynomial in $\dot x$, $\dot y$ with coefficients from $\Omega(x, y, r, r_1)$,
and indeed of smaller dimension in $\dot x$, $\dot y$ than $g$ itself.

By complete induction it therefore follows that $g$ is a polynomial in the
single variable $\dot x^2+\dot y^2-2V$. This completes the proof of the theorem
stated at the beginning.
ON THE CHARACTERISTIC VALUES
OF THE MATRIX f(A, B)*


BY 
WILLIAM E. ROTH 


DEFINITION 1. If A and B, two matrices of order n, have the characteristic
values $a_i$ ($i=1, 2, \dots, n$) and $b_i$ ($i=1, 2, \dots, n$), respectively, then A and B
have the property P if and only if every scalar polynomial f(A, B) in A and B
has the characteristic values $f(a_i, b_i)$ ($i=1, 2, \dots, n$); and the characteristic
value $a_i$ of A is said to be associated with $b_i$ of B if $f(a_i, b_i)$ is a characteristic
value of every f(A, B).
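A simple family where Definition 1 can be verified by hand is a pair of upper triangular matrices: every polynomial in them is again upper triangular, so its characteristic values can be read off the diagonal. The sketch below uses this special case only; the matrices and the non-commutative polynomial $f(A, B) = ABA + 2B^2 - 3A$ are hypothetical choices for illustration.

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def combine(terms):
    """Linear combination of equally sized matrices: terms = [(coeff, matrix), ...]."""
    n, m = len(terms[0][1]), len(terms[0][1][0])
    return [[sum(c * M[i][j] for c, M in terms) for j in range(m)] for i in range(n)]

# Hypothetical upper triangular pair; A and B do not commute.
A = [[1, 2, 0],
     [0, 1, 5],
     [0, 0, 3]]
B = [[4, 1, 7],
     [0, 2, 1],
     [0, 0, 2]]

def f_matrix(A, B):
    # f(A, B) = A B A + 2 B^2 - 3 A
    return combine([(1, matmul(matmul(A, B), A)), (2, matmul(B, B)), (-3, A)])

def f_scalar(a, b):
    return a * b * a + 2 * b * b - 3 * a

M = f_matrix(A, B)
diagonal = [M[i][i] for i in range(3)]
expected = [f_scalar(A[i][i], B[i][i]) for i in range(3)]
below_diagonal = [M[i][j] for i in range(3) for j in range(i)]
print(diagonal, expected)
```

Here the $i$th diagonal entries of A and B play the role of associated characteristic values $a_i$, $b_i$.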

It is well known that pairs of commutative matrices have the property P,†
and recently McCoy‡ showed that quasi-commutative matrices likewise have
this property. In §1 we prove a general condition which is necessarily satisfied
by a matrix pair having the property P; moreover we show how the characteristic
values of two such matrices must be paired or associated. When A
is assumed to be a Jordan canonical matrix, we display in §2 the form of B
such that A and B have the property P. The results obtained include, as far
as is known to the author, all known matrix pairs having the property P,§
and include as well those obtained by Bruton.‖


1. NECESSARY CONDITIONS 
THEOREM I. If


$A = A_1 \dot{+} A_2 \dot{+} \cdots \dot{+} A_r$,¶


* Presented to the Society, April 19, 1935; received by the editors February 25, 1935. 

After the present paper had been sent to the editors, I learned of Professor Williamson’s article, 
The simultaneous reduction of two matrices to triangle form, American Journal of Mathematics, vol. 57 
(April, 1935), pp. 281-293. Professor Williamson’s very interesting paper gives necessary and suffi- 
cient conditions for the existence of the property P in case one of the matrices, say A, is such that its 
characteristic equation is also its minimal equation, whereas the present article does not make this 
restriction on A and does not establish conditions which are both necessary and sufficient. 

† Frobenius, Über vertauschbare Matrizen, Berliner Sitzungsberichte, 1896, pp. 601-614.

‡ McCoy, On quasi-commutative matrices, these Transactions, vol. 36 (1934), pp. 327-340.

§ MacDuffee, The Theory of Matrices, Berlin, 1933, in §16 gives a concise résumé of known results.

‖ Bruton, Certain aspects of the theory of equations of a pair of matrices. Thesis (1932), University
of Wisconsin; abstract in the Bulletin of the American Mathematical Society, vol. 38 (1932), p. 633.

¶ The dot sum representing A indicates that the blocks $A_i$ occur along the principal diagonal of
A and that all remaining blocks are zero matrices.
where $|A_i - \lambda I| = (a_i - \lambda)^{n_i}$, $a_i \neq a_j$ in case $i \neq j$, and $B = (B_{ij})$, where $B_{ij}$
($i, j = 1, 2, \dots, r$) are $n_i \times n_j$ matrices, have the property P, then the determinant


(1) $|B - \lambda I| = \prod_{i=1}^{r} |B_{ii} - \lambda I_i|$,


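and the characteristic values of B associated with $a_i$ of A are those of $B_{ii}$, and
moreover the matrices


$A_i^{\sigma} B_{ii} - B_{ii} A_i^{\sigma}$

and

$A_\alpha^{\sigma_1} B_{\alpha\beta} A_\beta^{\sigma_2} B_{\beta\gamma} \cdots A_\rho^{\sigma_s} B_{\rho\alpha} A_\alpha^{\sigma_{s+1}}$,
where $\alpha, \beta, \dots, \rho$ are not all equal and $\sigma_i = 0, 1, 2, \dots$, are nilpotent
matrices.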


The proof of this theorem follows. Let


$\psi_i(\lambda) = \prod_{j \neq i} (a_j - \lambda)^{n_j}$ ($i = 1, 2, \dots, r$),
that is, $\psi_i(\lambda)$ is the characteristic function of A from which the factors corresponding
to $A_i$ are deleted. The matrix $\psi_i(A_i)$ is non-singular, for by hypothesis
$a_i \neq a_j$, if $i \neq j$; on the other hand, $\psi_i(A_j) = 0$, if $i \neq j$. Let $g_i(A_i)$ be the
inverse of $\psi_i(A_i)$; then the polynomial $\varphi_i(A) = \mu_i g_i(A)\psi_i(A)$ exists such that

$\varphi_i(A_i) = \mu_i I_i$,
$\varphi_i(A_j) = 0$, $i \neq j$,
where $\mu_i$ is an arbitrary parameter, and $I_i$ is a unit matrix of order $n_i$. Obviously
$\varphi_i(A)$ is the direct sum of zero matrices save the $i$th, which is $\mu_i I_i$. Then
since A and B have the property P, the matrix $\varphi_i(A) + B$ must have the characteristic
value $\varphi_i(a_i) + \beta = \mu_i + \beta$, where $\beta$ is a characteristic value of B which
is associated with $a_i$ of A. Hence
$|\varphi_i(A) + B - (\mu_i + \beta)I| = 0$
for all values of $\mu_i$. We shall display this determinant for the case $i = 1$:
$$\Delta(\mu_1) = \begin{vmatrix}
B_{11} - \beta I_1 & B_{12} & \cdots & B_{1r} \\
B_{21} & B_{22} - (\mu_1 + \beta) I_2 & \cdots & B_{2r} \\
\vdots & \vdots & & \vdots \\
B_{r1} & B_{r2} & \cdots & B_{rr} - (\mu_1 + \beta) I_r
\end{vmatrix}.$$


The determinant is a polynomial of degree $n - n_1$ in $\mu_1$, whose constant term,
$\Delta(0) = |B - \beta I|$, vanishes because $\beta$ is a characteristic value of B. Moreover,
since $\Delta(\mu_1)$ is identically zero under the hypothesis that A and B have the
property P, we conclude that


$|B_{11} - \beta I_1| = 0$,


since this is, up to sign, the coefficient of $\mu_1^{n-n_1}$. That is, $\Delta(\mu_1)$ vanishes identically only if
all characteristic values of $B_{11}$ are also characteristic values of B. Hence the
$n_1$ characteristic values of B associated with $a_1$ are those of $B_{11}$. Similarly the
characteristic values of $B_{ii}$ ($i = 2, 3, \dots, r$) are also characteristic values of
B and are associated with $a_i$ respectively. We can therefore conclude that
equation (1) holds and the first part of the theorem above is proved.
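The auxiliary polynomial $\varphi_i(A)$ of the proof can be written down explicitly in a small case. In the sketch below, A is the direct sum of a 2-by-2 Jordan block with characteristic value 1 and a 1-by-1 block with value 2, so that $\psi_1(\lambda) = 2 - \lambda$ and one may take $g_1(\lambda) = \lambda$, because $A_1(2I - A_1) = I$. The matrices, the value of $\mu_1$, and the upper triangular choice of B are assumptions for illustration.

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

# A = J_2(1) (+) (2): blocks A_1 of order 2 and A_2 of order 1.
A = [[1, 1, 0],
     [0, 1, 0],
     [0, 0, 2]]
I3 = identity(3)
mu1 = 5  # arbitrary parameter mu_1 of the proof (hypothetical value)

# psi_1(A) = 2I - A vanishes on block A_2 and is non-singular on block A_1.
psi1 = [[2 * I3[i][j] - A[i][j] for j in range(3)] for i in range(3)]

# phi_1(A) = mu_1 * g_1(A) * psi_1(A) with g_1(lambda) = lambda.
product = matmul(A, psi1)
phi1 = [[mu1 * product[i][j] for j in range(3)] for i in range(3)]

# Hypothetical upper triangular B, so that A and B have the property P.
B = [[4, 1, 7],
     [0, 2, 1],
     [0, 0, 3]]
shifted = [[phi1[i][j] + B[i][j] for j in range(3)] for i in range(3)]
shifted_diagonal = [shifted[i][i] for i in range(3)]
print(phi1, shifted_diagonal)
```

Only the characteristic values of B belonging to the first block are shifted by $\mu_1$; this is the mechanism that isolates $|B_{11} - \beta I_1|$ in the determinant $\Delta(\mu_1)$.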

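To prove the final assertion of the theorem above, we build up further
polynomials in A and B. Let


$C^{(\sigma)} = (\varphi_1(A) + \varphi_2(A) + \cdots + \varphi_r(A))A^{\sigma}$
$= \mu_1 A_1^{\sigma} \dot{+} \mu_2 A_2^{\sigma} \dot{+} \cdots \dot{+} \mu_r A_r^{\sigma}$;
then


$D^{(\sigma)} = C^{(\sigma)}B - BC^{(\sigma)} = (\mu_i A_i^{\sigma} B_{ij} - \mu_j B_{ij} A_j^{\sigma})$


is again a polynomial in A and B. Since A and B have the property P, the
characteristic values of $D^{(\sigma)}$ are all zeros. Moreover, A and $D^{(\sigma)}$ have the
property P; hence, according to the demonstration already given, the characteristic
values of $D^{(\sigma)}$ associated with $a_i$ of A are those of


$D_{ii}^{(\sigma)} = \mu_i(A_i^{\sigma} B_{ii} - B_{ii} A_i^{\sigma})$.
Since these are all zeros, the matrices
$A_i^{\sigma} B_{ii} - B_{ii} A_i^{\sigma}$ ($\sigma = 0, 1, \dots, n_i - 1$)
must all be nilpotent. At most the first $n_i$ powers of $A_i$ are linearly independent,
hence $\sigma$ need not exceed $n_i - 1$.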
Finally let 


(o,) 


(o,) (o,) ; (o,) ; 
= * + +--+ + *, 


(0) (0) (0) 


D, =C, B— BC; , 


where the $\mu_{ik}$ ($i = 1, 2, \dots, r$; $k = 1, 2, \dots, s$) are arbitrary parameters.
Then the matrix 


(oy) (ox—1) (0) (on 41) 


is again a polynomial in A and B and is nilpotent, because the characteristic
values of $D_k^{(0)} = ((\mu_{ik} - \mu_{jk})B_{ij})$ are zeros. The characteristic values of the
matrix (2) associated with $a_\alpha$ of A are those of an $n_\alpha \times n_\alpha$ matrix which is
a linear combination of terms of the form 


with independent coefficients in the $\mu_{ik}$ ($i = 1, 2, \dots, r$; $k = 1, 2, \dots, s$), and that
where $\alpha = \beta = \cdots = \rho$ is identically zero since $D_{ii}^{(0)} = 0$. Hence each of the terms above
must be nilpotent, where not all $\alpha, \beta, \dots, \rho$ are equal. This completes the
proof of the theorem.

We may note here that the foregoing implies that all matrices


$B_{\alpha\beta} B_{\beta\gamma} \cdots B_{\rho\alpha}$


are nilpotent provided not all subscripts $\alpha, \beta, \gamma, \dots, \rho$ are equal. If


are nilpotent for ($\alpha, \beta = 1, 2, \dots, r$; $\sigma_\alpha = 0, 1, 2, \dots, n_\alpha - 1$ and $\sigma_\beta = 0, 1,
2, \dots, n_\beta - 1$), it is quite possible that A and B have the property P; though
the writer has not succeeded in proving that such is the case nor that the conditions
as stated in the theorem above are sufficient.


2. SUFFICIENT CONDITIONS 


We shall now develop sufficient conditions that the matrices A and B 
have the property P. To do so we shall find it convenient to make certain 
definitions of terms and to prove two lemmas. 


DEFINITION 2. The matrix $M = (m_{ij})$ of order $p \times q$ has $p+q-1$ diagonals.
These we number consecutively beginning with that containing the single element
$m_{p,1}$. Then we say that the $r$th diagonal, $0 \le r \le p+q$, of M is starred if the
first $r-1$ diagonals of M contain only zero elements and the remaining diagonals
of M may or may not contain zero elements.


That is to say, the diagonal to be starred of a $p \times q$ zero matrix may be
chosen in $p+q+1$ ways; that of a non-zero $p \times p$ diagonal matrix may be
chosen in $p$ ways. For example, the $3 \times 2$ matrix may have its diagonal to
be starred chosen in one of the following six ways:

0 0 0 0 

00, 00, 00, 
where an asterisk represents an element in a starred diagonal and both
asterisks and dots represent arbitrary elements which may be zeros. In the
first we may regard the sixth diagonal as starred and in the last the zeroth
diagonal as starred.


DEFINITION 3. The matrix $X = (X_{ij})$, where $X_{ij}$ ($i, j = 1, 2, \dots, r$) are $n_i \times n_j$
matrices whose $\rho_{ij}$th ($0 \le \rho_{ij} \le n_i + n_j$) diagonals respectively are starred, is an
umbral matrix, if
(a) pig + pis = + (i,7 =1,2,---,7), 
and if (b) in any row of X exactly s elements are starred then exactly s—1 other 
rows of X have the corresponding s elements starred. 


For example, the matrix 


oo 


where the Greek letters represent the location of starred elements in starred 
diagonals and dots represent the location of arbitrary elements, is an umbral 
matrix. According to the definition above, the umbral matrix X may be 
chosen in 


+ + 1) 
1 


ways, where n, =n; (i=1, 2, - - - ,r); for evidently the diagonals to be starred 
can be chosen arbitrarily only in the matrices X;, or in X;; (¢=1, 2,---,7); 
the remaining blocks have their starred diagonals determined by conditions 
(a) and (b) of the definition above. Determinants of umbral matrices, in 
which all blocks are square and the principal diagonal of each is starred, were 
studied by Williamson.t The direct product A<B>=(a,;B), where the 
principal diagonal of B is starred, is an umbral matrix of the same type. 


DEFINITION 4. Two umbral matrices X and Y are related if corresponding 
elements of X and Y are starred. 


† Williamson, Latent roots of a matrix of a special type, Bulletin of the American Mathematical
Society, vol. 37 (1931), pp. 585-590.


0 0 ea 0 a B

U=|0 0 a:0

000 0 0 0 6
LEMMA I. If X is an umbral matrix then there exists an orthogonal matrix Q
such that


(3) $QXQ^T = ((X_{ij}))$,


where $Q^T$ is the transpose of Q, and where the $\nu_i \times \nu_j$ matrices $X_{ij}$ ($i, j = 1, 2, \dots, p$)
are zero if $i > j$, contain only starred elements of X if $i = j$, and contain only
arbitrary non-starred elements of X if $i < j$.†


According to condition (a) of Definition 3 of an umbral matrix $X = (x_{ij})$
($i, j = 1, 2, \dots, n$), we conclude that all $x_{ii}$ ($i = 1, 2, \dots, n$) are starred;
also that if $x_{ij}$ is starred, then $x_{ji}$ is likewise starred; and on the other hand if
$x_{ij}$ is not starred, then either $x_{ij}$ or $x_{ji}$ is a zero element. From these conclusions
and by condition (b) of the definition of an umbral matrix, it follows
that if $s$ rows have $s$ starred elements each, then the corresponding $s$ columns
also have $s$ starred elements each, and the $s^2$ intersections of these $s$ rows and
$s$ columns locate the only starred elements in these rows and columns. Moreover,
these $s$ rows have only zero elements in $z$ columns and only non-starred
elements in $u$ columns (i.e., $z + u = n - s$); and the corresponding $s$ columns
have only zero elements in $u$ rows and only non-starred arbitrary elements in
$z$ rows. That is, by an interchange of rows and the corresponding interchange
of columns of X, the $s \times s$ minor determined by them may be brought into
the principal diagonal of the transformed matrix with an $s \times z$ zero matrix
on its left and a $u \times s$ zero matrix below it. The same may be done with all
starred elements of X. Hence X may be transformed to a matrix having all
starred elements and no other elements of X in non-overlapping square blocks
along the principal diagonal and zero blocks below and to the left of the diagonal
blocks.

The interchange of rows of a matrix can always be accomplished by multiplying
it on the left by a matrix, Q, having only one non-zero element, namely
unity, in each row and column. The corresponding interchange of columns
results if the matrix be multiplied on the right by $Q^T$, the transpose of Q.
Obviously $QQ^T = I$. Hence the lemma is proved.
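The role of Q in this argument can be made concrete. The sketch below builds a permutation matrix from a list `perm` (a hypothetical choice), checks that $QQ^T = I$, and confirms that $QXQ^T$ applies the same interchange to the rows and to the columns of X.

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(M):
    return [list(row) for row in zip(*M)]

def permutation_matrix(perm):
    """Q with a single 1 in each row: row i of Q*X is row perm[i] of X."""
    n = len(perm)
    return [[1 if perm[i] == j else 0 for j in range(n)] for i in range(n)]

X = [[11, 12, 13],
     [21, 22, 23],
     [31, 32, 33]]
perm = [2, 0, 1]  # hypothetical reordering of rows and columns

Q = permutation_matrix(perm)
QT = transpose(Q)
conjugated = matmul(matmul(Q, X), QT)
relabelled = [[X[perm[i]][perm[j]] for j in range(3)] for i in range(3)]
print(conjugated)
```

Diagonal elements of X stay on the diagonal under this transformation, which is why the starred square blocks can be collected along the principal diagonal.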

The matrix Q, which transforms the umbral matrix, X, to the form established
above plays an important role in the sequel and is not unique, if $r > 1$.
However, we may so choose Q that any $s$ rows and corresponding $s$ columns
whose intersections determine an $s \times s$ block in the principal diagonal of
$QXQ^T$ are not interchanged among themselves. That is, the $s^2$ elements in
question have the same relative positions with respect to each other in the
diagonal blocks of $QXQ^T$ that they enjoy in X. This fact will be useful later.


† Hereafter the double parentheses will indicate a matrix in which only the blocks $X_{ij}$, $i > j$, are
necessarily zero matrices.

On the basis of the above lemma we have at once the following
COROLLARY. The determinant of an umbral matrix depends on and only on
its starred elements.
LEMMA II. If f(X, Y) is a polynomial in the related umbral matrices X
and Y, then X, Y, and f(X, Y) are related umbral matrices.


The matrix Q, which transforms X to the form (3), transforms Y to the
corresponding matrix


(4) $QYQ^T = ((Y_{ij}))$,
where $Y_{ij}$ are $\nu_i \times \nu_j$ matrices. But
$Qf(X, Y)Q^T = f(QXQ^T, QYQ^T)$


is a matrix, whose diagonal blocks are $f(X_{ii}, Y_{ii})$ ($i = 1, 2, \dots, p$) and those
below and to the left of them are zero blocks. Hence the transform $Q^T(\ )Q$
transforms $Qf(X, Y)Q^T$ to f(X, Y), which is therefore an umbral matrix
related to X and to Y.


COROLLARY 1. If X and Y are related umbral matrices which are transformed
by the orthogonal matrix Q to the forms (3) and (4) respectively, then the
determinant


$|f(X, Y)| = \prod_{i=1}^{p} |f(X_{ii}, Y_{ii})|$
and consequently depends only on the starred elements of X and Y.


COROLLARY 2. If X is an umbral matrix and if all starred elements in and
above or in and below its principal diagonal are zeros, then X is a nilpotent
matrix.


If all starred elements of X in and below its principal diagonal are zeros,
then the orthogonal matrix Q can be so chosen that $QXQ^T$ has only zero elements
in and below its principal diagonal, and is therefore nilpotent. If all
starred elements of X in and above the principal diagonal are zeros, the same
conclusion holds.


COROLLARY 3. If X and Y are two matrices of order n, if R exists such that
$RXR^{-1}$ and $RYR^{-1}$ are related umbral matrices and $RXR^{-1}$ is in the Jordan
canonical form, and if all starred elements of $RYR^{-1}$ in and above or in and below
its principal diagonal are zeros, then $|X+Y| = |X|$.


In the present case, where $RXR^{-1}$ is in the Jordan canonical form, Q
exists such that the diagonal blocks of $QRXR^{-1}Q^T$ are diagonal matrices having
non-zero elements only in their principal diagonals since the only non-zero
starred elements are those in the principal diagonal of $RXR^{-1}$. The
diagonal blocks of $QRYR^{-1}Q^T$, under the hypotheses of the above corollary,
have only zero elements in and below (or above) their principal diagonals,
consequently $|X+Y| = |X|$. This corollary is a generalization of Frobenius'
theorem† which requires X and Y to be commutative.


THEOREM II. If $A$ and $B$ are given matrices of order $n$, if $RAR^{-1} = \mathfrak{A}$ and $RBR^{-1} = \mathfrak{B}$, where

$$\mathfrak{A} = A_1 \dotplus A_2 \dotplus \cdots \dotplus A_r,$$

and

$$A_i = \begin{pmatrix} a_i & 1 & 0 & \cdots & 0 \\ 0 & a_i & 1 & \cdots & 0 \\ \vdots & & & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & a_i \end{pmatrix} \qquad (i = 1, 2, \cdots, r) \tag{6}$$

are matrices of order $n_i$, and if $\mathfrak{A}$ and $\mathfrak{B}$ are related umbral matrices, such that for every starred element $b_{hk}$ of $\mathfrak{B} = (b_{ij})$ we have

$$b_{hk}(a_{hh} - a_{kk}) = 0, \tag{5}$$

either for all $h>k$, or for all $h<k$, where we designate the $h$th principal diagonal element of $\mathfrak{A}$ by $a_{hh}$ $(h = 1, 2, \cdots, n)$, then $A$ and $B$ have the property P.

This theorem says merely that if for example

aio

0a 0 6b
00a

and $U$ is the matrix given above (page 238), then $\mathfrak{A}$ and $U$ are related umbral matrices and have the property P in case $a = b$; if, however, $a \ne b$ and if either the $\beta$'s or the $\gamma$'s of $U$ represent zero elements, then $\mathfrak{A}$ and $U$ have the property P.

Let $Q$ be the matrix which transforms the related umbral matrices $\mathfrak{A}$ and $\mathfrak{B}$ to the forms

$$Q\mathfrak{A}Q^T = ((\mathfrak{A}_{ij})), \qquad Q\mathfrak{B}Q^T = ((\mathfrak{B}_{ij})),$$

where the diagonal blocks $\mathfrak{A}_{ii}$ and $\mathfrak{B}_{ii}$ $(i = 1, 2, \cdots, p)$ are of order $\nu_i$, $\sum_i \nu_i = n$. The matrices $\mathfrak{A}_{ii}$ are diagonal matrices with certain permutations of $\nu_i$ elements $a_j$ $(j = 1, 2, \cdots, r)$ in their principal diagonals. If all $a_j = a$ $(j = 1, 2, \cdots, r)$, then all blocks $\mathfrak{A}_{ii}$ $(i = 1, 2, \cdots, p)$ are scalar matrices and as such have the property P with the matrices $\mathfrak{B}_{ii}$ $(i = 1, 2, \cdots, p)$ respectively. Hence according to Lemmas I and II, the matrices $\mathfrak{A}$ and $\mathfrak{B}$ themselves have the property P without further restrictions upon $\mathfrak{B}$. However, if the $a_j$ $(j = 1, 2, \cdots, r)$ are not all equal, then neither the sum nor the product of $\mathfrak{A}_{ii}$ and $\mathfrak{B}_{ii}$ will necessarily have the property P; hence in order that the matrices $\mathfrak{A}$ and $\mathfrak{B}$ may have the property P further restrictions must be placed upon $\mathfrak{B}$.

‡ Frobenius, Über lineare Substitutionen und bilineare Formen, Crelle's Journal, vol. 84 (1878), pp. 1-63.

As was pointed out following the proof of Lemma I, we may so choose $Q$ that starred elements of $\mathfrak{B}$ do not change the relative positions in the blocks $\mathfrak{B}_{ii}$ that they enjoy in $\mathfrak{B}$; moreover, it is no restriction upon $\mathfrak{A}$ to assume that in $\mathfrak{A}$ all blocks $A_j$ $(j = 1, 2, \cdots, r)$ having the same characteristic value are adjacent. Hence with $Q$ so chosen, like characteristic values $a_j$ will occur in adjacent positions along the principal diagonals of each block $\mathfrak{A}_{ii}$ $(i = 1, 2, \cdots, p)$. Now according to the hypothesis $b_{hk}(a_{hh} - a_{kk}) = 0$, we have $b_{hk} = 0$ if $a_{hh} \ne a_{kk}$, and the same is true of the corresponding blocks $\mathfrak{A}_{ii}$ and $\mathfrak{B}_{ii}$. For example, $\mathfrak{A}_{ii}$ and $\mathfrak{B}_{ii}$ under the hypothesis above must have the following forms:


a, 0-0 0 0 (. 
0 a0 0 0 

Wi= |0 ag 0 -- <4], 
0 0 0 a 0 0 0 
00 0 0 a, 000 0 


where $a_\alpha$, $a_\beta$, and $a_\gamma$ are distinct and the dots of $\mathfrak{B}_{ii}$ represent arbitrary elements. Obviously matrices of this form have the property P. Hence if $b_{hk}(a_{hh} - a_{kk}) = 0$ for all $h>k$ the matrices $\mathfrak{A}$ and $\mathfrak{B}$ and consequently $A$ and $B$ have the property P. The same conclusion holds if $b_{hk}(a_{hh} - a_{kk}) = 0$ for all $h<k$. Herewith the theorem is proved.

THEOREM III. If $Y = (Y_{ij})$, where $Y_{ij}$ $(i, j = 1, 2, \cdots, r)$ are $n_i \times n_j$ matrices having zero elements in the first $|n_i, n_j| - 1$ diagonals, where $|n_i, n_j|$ is the greater of the two numbers $n_i$ and $n_j$ or their common value, then the determinant of $Y$ is dependent only on the diagonal elements of those blocks $Y_{ij}$ where $n_i = n_j$; that is,

$$|Y| = |Y_\alpha| \cdot |Y_\beta| \cdots |Y_\lambda|,$$

where $Y_k = (Y_{ij})$ $(k = \alpha, \beta, \cdots, \lambda)$, and $i$, $j$ run only over those values for which $Y_{ij}$ are square matrices of order $n_k$, and $n_\alpha, n_\beta, \cdots, n_\lambda$ are the distinct values of $n_i$ $(i = 1, 2, \cdots, r)$.
It is no restriction upon $Y$ to assume that $n_1 \ge n_2 \ge \cdots \ge n_r$. Now if we regard the $n_i$th diagonal of the blocks $Y_{ij}$ as starred, then $Y$ is an umbral matrix, and by Lemma I and its corollary the determinant of $Y$ depends only on the $n_i$th diagonal elements of its blocks. However, the elements of the $n_i$th diagonal of $Y_{ij}$ are all zeros in case $i>j$ and $n_i<n_j$. Hence by a procedure similar to that followed in the proof of Theorem II, we can show that the present theorem holds.†

The matrices commutative or quasi-commutative with the Jordan canonical matrix $\mathfrak{A}$ given in Theorem II are of the form $Y$ here considered, save that the blocks $Y_{ij}$ must be zero if $a_i \ne a_j$ and the non-zero elements of a given diagonal of $Y_{ij}$ are not linearly independent in such cases. The above theorem makes no such restrictions upon the elements in and above the $|n_i, n_j|$th diagonals of $Y_{ij}$.

Added in proof, January 7, 1936. Another attack upon the problem here discussed was recently made by McCoy (Bulletin of the American Mathematical Society, vol. 41 (1935), p. 635, abstract 41-9-351). Professor McCoy has kindly communicated a more complete statement of his main results to me. They are very general and supplement rather than duplicate those we give above.


† Williamson, Bulletin of the American Mathematical Society, vol. 37 (1931), pp. 585-590, proved a special case under the above theorem.
A CLASSIFICATION OF GENERATING FUNCTIONS*


BY 
D. V. WIDDER 


Introduction. By a generating function we mean a function $f(x)$ which can be represented by a Laplace-Stieltjes integral

$$f(x) = \int_0^\infty e^{-xt}\,d\alpha(t) \tag{1}$$

which converges for some value of $x$. Here $\alpha(t)$ is a function of bounded variation in the interval $(0, R)$ for every positive $R$, and by convergence we mean that

$$\lim_{R\to\infty} \int_0^R e^{-xt}\,d\alpha(t)$$

exists. For most purposes it is convenient to assume that $\alpha(t)$ is "normalized"; that is,

$$\alpha(t) = \frac{\alpha(t+) + \alpha(t-)}{2} \quad (t > 0), \qquad \alpha(0) = 0.$$

Such normalization has no effect on the function $f(x)$. The representation of a function $f(x)$ in the form (1), with $\alpha(t)$ normalized, is unique. The function $\alpha(t)$ is called the "determining" function corresponding to the generating function $f(x)$. In particular, if

$$\alpha(t) = \int_0^t \varphi(u)\,du,$$

the integral (1) becomes

$$f(x) = \int_0^\infty e^{-xt}\varphi(t)\,dt. \tag{2}$$

Or, if $\alpha(t)$ is a step-function with its jumps at the points $\lambda_n$,

$$\lim_{n\to\infty} \lambda_n = \infty,$$


* Presented to the Society, December 29, 1934; received by the editors February 1, 1935, and, in 
revised form, June 27, 1935. 
the integral (1) reduces to the Dirichlet series

$$f(x) = \sum_{n=0}^{\infty} a_n e^{-\lambda_n x}, \tag{3}$$

which may, in particular, be a power series in $e^{-x}$. Thus one sees the appropriateness of the term generating function.*

It is the purpose of the present paper to obtain characterizations of generating functions $f(x)$ which will insure that their corresponding determining functions shall have certain prescribed properties, such as continuity, absolute continuity, etc. We find that these characterizations are most conveniently expressed in terms of an inversion operator of E. L. Post which was discussed in detail in an earlier paper.† For this operator we use the notation

$$L_{k,t}[f(x)] = \frac{(-1)^k}{k!}\, f^{(k)}\!\left(\frac{k}{t}\right) \left(\frac{k}{t}\right)^{k+1} \quad (k = 1, 2, \cdots).$$

We have proved that for (2)

$$\lim_{k\to\infty} L_{k,t}[f(x)] = \varphi(t)$$

for almost all positive values of $t$. In the present paper we also show that for (1)

$$\alpha(t) - \alpha(0+) = \lim_{k\to\infty} \int_0^t L_{k,u}[f(x)]\,du.$$

We introduce here one further operator,

$$l_{k,t}[f(x)] = t\left(\frac{2\pi}{k}\right)^{1/2} L_{k,t}[f(x)],$$

and we are able to show for the integral (1) that

$$\lim_{k\to\infty} l_{k,t}[f(x)] = \alpha(t+) - \alpha(t-),$$

and for the series (3) that

$$\lim_{k\to\infty} l_{k,\lambda_n}[f(x)] = a_n \quad (n = 1, 2, \cdots).$$


* Originally, the generating function of a sequence $\{a_n\}$ was the function whose Maclaurin development had the constants $a_n$ for its coefficients.

† In this and subsequent foot-notes we shall use the following abbreviations: I for D. V. Widder, A generalization of Dirichlet's series and of Laplace's integrals by means of a Stieltjes integral, these Transactions, vol. 31 (1929), pp. 694-743.

II for D. V. Widder, Necessary and sufficient conditions for the representation of a function as a Laplace integral, these Transactions, vol. 33 (1931), pp. 851-892.

III for D. V. Widder, The inversion of the Laplace integral and the related moment problem, these Transactions, vol. 36 (1934), pp. 107-200.

The present reference is to III, where a reference to E. L. Post will be found on p. 108.

This is a determination of the coefficients of a Dirichlet series which is analogous to the determination of the coefficients of a power series in terms of the derivatives of the sum of the series.
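
As a concrete illustration of the inversion operator $L_{k,t}[f(x)] = \frac{(-1)^k}{k!} f^{(k)}(k/t)(k/t)^{k+1}$, the following sketch evaluates it exactly for the test function $f(x) = 1/(x+1)$, whose determining density is $\varphi(t) = e^{-t}$. This example is not taken from the paper: the test function, the helper names `deriv_reciprocal_shift` and `post_widder`, and the use of exact rational arithmetic (to avoid overflow in $k!$ and $(k/t)^{k+1}$) are all assumptions made for illustration.

```python
from fractions import Fraction
import math


def deriv_reciprocal_shift(k, x):
    """k-th derivative of the illustrative test function f(x) = 1/(x + 1)."""
    return Fraction((-1) ** k * math.factorial(k)) / (x + 1) ** (k + 1)


def post_widder(deriv, k, t):
    """L_{k,t}[f] = (-1)^k / k! * f^(k)(k/t) * (k/t)^(k+1), computed exactly.

    `deriv(k, x)` must return the k-th derivative of f at x as a Fraction.
    """
    x = Fraction(k) / Fraction(t)
    return Fraction((-1) ** k, math.factorial(k)) * deriv(k, x) * x ** (k + 1)


if __name__ == "__main__":
    t = 1
    for k in (5, 20, 200):
        approx = float(post_widder(deriv_reciprocal_shift, k, t))
        # The gap to exp(-t) should shrink as k grows.
        print(k, approx, abs(approx - math.exp(-t)))
```

For this particular $f$ the operator reduces to $(k/(k+t))^{k+1}$, so the slow, roughly $1/k$, approach to $e^{-t}$ is visible directly; the sketch says nothing about rates for other generating functions.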

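We summarize the more important results of the paper in the following table:


(I) $\alpha(t)$ is of bounded variation in $(0, \infty)$;

$$\int_0^\infty |L_{k,t}[f(x)]|\,dt \le M \quad (k = 1, 2, \cdots).$$

(II) $\alpha(t)$ is non-decreasing in $(0, \infty)$;

$$L_{k,t}[f(x)] \ge 0 \quad (t > 0;\ k = 1, 2, \cdots), \qquad f(x) \ge 0 \quad (x > 0).$$

(III) $|\varphi(t)| < M$ in $(0, \infty)$;

$$|L_{k,t}[f(x)]| \le M \quad (t > 0;\ k = 1, 2, \cdots), \qquad f(\infty) = 0.$$

(IV) $\varphi(t)$ is of class $L^p$ $(p > 1)$ in $(0, \infty)$;

$$\int_0^\infty |L_{k,t}[f(x)]|^p\,dt \le M \quad (k = 1, 2, \cdots), \qquad f(\infty) = 0.$$

(V) $\varphi(t)$ is of class $L$ in $(0, \infty)$;

$$\operatorname*{l.i.m.}_{k\to\infty} L_{k,t}[f(x)] \ \text{exists}, \qquad f(\infty) = 0.$$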
(VI) $\varphi(t)$ is of bounded variation in $(0, \infty)$;


(VII) $\varphi^{(n)}(t)$ is of bounded variation in $(0, \infty)$;


Li.[f(x)]| dt M (k=1,2,---), =0. 


J | M (k = 0, 1,2,---),f(~) =0. 
0 


(VIII) $|\alpha(t)| < M$ in $(0, \infty)$;


ff (R>0;k = 1, 


aes (R>0;k =1,2,---). 
(IX) $\alpha(t)$ is such that (1) converges for $x > 0$;


R 
f Lealf(x + ©) ]dt 


lA 


| 


+ oll ars (R>0;k =1,2,---), 


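for every $\varepsilon > 0$.
(X) $\alpha(t)$ is of bounded variation and continuous in $(0, \infty)$;

$$\int_0^\infty |L_{k,t}[f(x)]|\,dt \le M \quad (k = 1, 2, \cdots), \qquad f(\infty) = 0,$$

$$\lim_{k\to\infty} l_{k,t}[f(x)] = 0 \quad (t > 0).$$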


(XI) $\varphi(t)$ is completely monotonic in $(0, \infty)$;
(-— 1)*[xf(x) >0 (x >0;n”,k = 0,1,2,---). 


(XII) $\alpha(t)$ is a step-function making (1) converge;
same as (IX) and

$$\lim_{k\to\infty} L_{k,t}[f(x)] = 0$$

uniformly in every sub-interval of an interval in which $\alpha(t)$ is to be constant.


In each of these cases the condition on $f(x)$ is merely abbreviated. For the exact statement the reader is referred to the corresponding theorem in the text. In each case the condition is necessary and sufficient. In (VI) and (VII) the upper index on the operator $L$ indicates differentiation with respect to $t$. In (V), l.i.m. means limit in the mean of order one,

$$\lim_{k\to\infty,\ l\to\infty} \int_0^\infty |L_{k,t}[f(x)] - L_{l,t}[f(x)]|\,dt = 0.$$


We note that the result (II) is due to S. Bernstein, the present statement 
being new only in form. The proof given in the present paper is new. The 
results (I) and (III) were proved earlier in a different form and with a differ- 
ent method by the author. All remaining results are new. 

We call special attention to the results (IX), (XI), (XII). In (IX) we have for the first time a necessary and sufficient condition for the representation of a function $f(x)$ in the most general convergent Laplace-Stieltjes integral. In (XII) we have a characterization of functions which can be represented in the most general convergent Dirichlet series.*

The result (XI) really has to do with the Stieltjes integral equation of the type

$$f(x) = \int_0^\infty \frac{d\alpha(t)}{x + t}$$

and gives for the first time a necessary and sufficient condition that it shall have a bounded non-decreasing solution $\alpha(t)$.

Throughout the paper the determining function is thought of as real. It should be pointed out here, however, that most of the results of the paper also hold for complex determining functions. Thus all the results cited in the foregoing table except those under (II), (XI), and (XII) hold without any change whatever in the complex case. The proof in each case is made by breaking the complex function in question into real and imaginary parts. Thus if the function $f(x+iy)$ is to have the form

$$f(x + iy) = \int_0^\infty e^{-(x+iy)t}\,d[\alpha_1(t) + i\alpha_2(t)],$$

where $\alpha_1(t) + i\alpha_2(t)$ is to be of bounded variation in $(0, \infty)$, it is necessary and sufficient that

$$\int_0^\infty |L_{k,t}[f(x)]|\,dt \le M \quad (k = 1, 2, \cdots).$$

To obtain this result from the corresponding real case one has only to use the relations

$$f(x) = f_1(x) + i f_2(x),$$
$$|L_{k,t}[f_1(x)]| \le |L_{k,t}[f(x)]|, \qquad |L_{k,t}[f_2(x)]| \le |L_{k,t}[f(x)]|,$$
$$|L_{k,t}[f(x)]| \le |L_{k,t}[f_1(x)]| + |L_{k,t}[f_2(x)]|.$$

The author wishes to express his indebtedness to Professor J. D. Tamarkin, 
who read the original manuscript, for many valuable suggestions. In particu- 
lar, the proof of Lemma 1 of §12 is his. The author’s original proof was longer, 
less ingenious. 

1. Some preliminary results. In this section we will establish certain re- 
sults of a general nature which will be of fundamental importance for later 
parts of the paper. 


* Compare Th. Kaluza, Entwickelbarkeit von Funktionen in Dirichletsche Reihen, Mathematische Zeitschrift, vol. 28 (1928), p. 203, where Dirichlet series with positive coefficients are treated. Compare also II, p. 876.
THEOREM 1.1. If

$$f(x) = \int_0^\infty e^{-xt}\,d\alpha(t),$$

where $\alpha(t)$ is a normalized function of bounded variation in $0 \le t \le R$ for every positive $R$ and the integral converges for some value of $x$, then

$$\alpha(t) = \alpha(0+) + \lim_{k\to\infty} \int_0^t L_{k,u}[f(x)]\,du \quad (0 < t < \infty).$$
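
As a check on the statement of Theorem 1.1, the following worked example may help; it is an illustration added here, not part of the original argument. It assumes the test function $f(x) = 1/(x+1)$, for which $\alpha(t) = 1 - e^{-t}$ and $\alpha(0+) = 0$, and uses the operator $L_{k,u}[f(x)] = \frac{(-1)^k}{k!} f^{(k)}(k/u)(k/u)^{k+1}$.

```latex
\begin{align*}
f^{(k)}(x) &= \frac{(-1)^k\, k!}{(x+1)^{k+1}}, \\
L_{k,u}[f(x)] &= \left(\frac{k/u}{k/u + 1}\right)^{k+1} = \left(1 + \frac{u}{k}\right)^{-(k+1)}, \\
\int_0^t L_{k,u}[f(x)]\,du &= \Bigl[-\bigl(1 + \tfrac{u}{k}\bigr)^{-k}\Bigr]_0^t
   = 1 - \left(1 + \frac{t}{k}\right)^{-k}
   \;\longrightarrow\; 1 - e^{-t} = \alpha(t) \quad (k \to \infty).
\end{align*}
```

Since $\alpha(0+) = 0$ here, the limit agrees with the formula of the theorem for every $t > 0$.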


To prove this we note first that the change of variable $k = uv$ gives us

$$\int_0^t L_{k,u}[f(x)]\,du = \frac{(-1)^k}{(k-1)!} \int_{k/t}^\infty f^{(k)}(v)\, v^{k-1}\,dv.$$


The operator S;,.[{(x)] defined earlier* is closely related to this latter in- 
tegral. In fact 


(- 


kit 


An integration by parts gives 


k — 1)! 
Hence 
k+1 k k 
But we proved in III that 
lim S;,.[f(«)] = (O<t<), 
kw 


Consequently our result will be established if we can prove that 
li o(=)(=) 0 (O<i<o) 
= 
oe k! f t t 


F(x) = f(x)/x. 


Set 


* III, p. 116. 


Then


We proved earlier* that

$$\lim_{k\to\infty} L_{k,t}[F(x)] = \alpha(t).$$

It will now be sufficient to show that the second term on the right-hand side of (1.1) approaches $-\alpha(t)$. Simple computation gives


= e~ dv. 
Fa 
On the other hand 
— 1)* 
lim Ly-1,1[F(x)] = lim = a(t). 
kw 0 


If we replace a(v) by a(v)e-*/* in this equation we have 


lim f dv = a(t)e!. 
(k 1)! 0 
Hence 
lim 1)" FO-D (=) 
ko (k 1)! t t 
= im ( ) f e~**y*-la(tv)dv = a(t) (¢>0). 
kao \kR — 1 (k — 1)!Jo 


The theorem is then completely established. 
We turn next to 


* III, p. 119. 
THEOREM 1.2. Let

$$f(x) = \int_0^\infty e^{-xt}\,d\alpha(t),$$

where $\alpha(t)$ is a normalized function of bounded variation in $(0, R)$ for every $R > 0$, the integral converging for some value of $x$. If $V(x)$ is the total variation of $\alpha(t)$ in the interval $0 \le t \le x$, then

$$V(R) = |f(\infty)| + \lim_{k\to\infty} \int_0^R |L_{k,t}[f(x)]|\,dt \quad (R > 0).$$

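Let $S$ be an arbitrary number greater than $R$. Set

$$f_1(x) = \int_0^S e^{-xt}\,d\alpha(t), \qquad f_2(x) = \int_S^\infty e^{-xt}\,d\alpha(t).$$

Then

$$\int_0^R |L_{k,t}[f(x)]|\,dt \le \int_0^R |L_{k,t}[f_1(x)]|\,dt + \int_0^R |L_{k,t}[f_2(x)]|\,dt.$$

Simple computation shows that

$$|L_{k,t}[f_1(x)]| \le L_{k,t}\!\left[\int_0^S e^{-xu}\,dV(u)\right].$$

By Theorem 1.1 we have

$$\lim_{k\to\infty} \int_0^R L_{k,t}\!\left[\int_0^S e^{-xu}\,dV(u)\right] dt = V(R) - V(0+).$$

On the other hand it may be seen that

$$\lim_{k\to\infty} L_{k,t}[f_2(x)] = 0 \tag{1.2}$$

uniformly in $0 \le t \le R$. For

$$f_2(x) = \int_0^\infty e^{-xt}\,d\beta(t),$$

$$\beta(t) = \alpha(t) \quad (S < t < \infty), \qquad \beta(t) = \alpha(S) \quad (0 \le t \le S).$$
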
Hence by an earlier result* (1.2) is established.
Consequently

$$\limsup_{k\to\infty} \int_0^R |L_{k,t}[f(x)]|\,dt \le V(R) - V(0+).$$

But

$$V(0+) = |\alpha(0+)| = |f(\infty)|,$$

so that

$$|f(\infty)| + \limsup_{k\to\infty} \int_0^R |L_{k,t}[f(x)]|\,dt \le V(R). \tag{1.3}$$

Next set

$$\alpha_k(t) = \int_0^t L_{k,u}[f(x)]\,du \quad (0 < t < \infty).$$

Then by Theorem 1.1

$$\lim_{k\to\infty} \alpha_k(t) = \alpha(t) - f(\infty) = \alpha(t) - \alpha(0+) \quad (0 < t < \infty).$$

Let

$$0 = t_0 < t_1 < \cdots < t_n = R.$$

Then

$$\sum_{i=0}^{n-1} |\alpha_k(t_{i+1}) - \alpha_k(t_i)| \le \int_0^R |L_{k,t}[f(x)]|\,dt.$$

Let $k$ become infinite in this inequality:

$$|\alpha(t_1) - \alpha(0+)| + \sum_{i=1}^{n-1} |\alpha(t_{i+1}) - \alpha(t_i)| \le \liminf_{k\to\infty} \int_0^R |L_{k,t}[f(x)]|\,dt.$$

The left-hand side of this inequality can be brought as close as desired to $V(R) - V(0+)$ by a suitable choice of the number and position of the points $t_i$. Hence

$$V(R) - V(0+) \le \liminf_{k\to\infty} \int_0^R |L_{k,t}[f(x)]|\,dt,$$

$$V(R) \le \liminf_{k\to\infty} \int_0^R |L_{k,t}[f(x)]|\,dt + |f(\infty)|.$$


* III, Theorem 13, p. 137. An examination of the proof of Theorem 13 will show that the convergence is uniform since $\alpha(t)$ is constant in $0 \le t \le R$.
Combining this last inequality with (1.3) the theorem is proved.
We now introduce the following


DEFINITION. The operator $l_{k,t}[f(x)]$ is defined by the equation

$$l_{k,t}[f(x)] = t\left(\frac{2\pi}{k}\right)^{1/2} L_{k,t}[f(x)] = (2\pi k)^{1/2}\, \frac{(-1)^k}{k!}\, f^{(k)}\!\left(\frac{k}{t}\right) \left(\frac{k}{t}\right)^{k} \quad (t > 0,\ k > 0).$$

By means of this operator we shall be able to discuss the saltus of the determining function $\alpha(t)$. In fact we shall prove


THEOREM 1.3. If the function $f(x)$ has the representation

$$f(x) = \int_0^\infty e^{-xt}\,d\alpha(t), \tag{1.4}$$

where $\alpha(t)$ is of bounded variation in $(0, R)$ for every positive $R$, the integral converging for some value of $x$, then

$$\lim_{k\to\infty} l_{k,t}[f(x)] = \alpha(t+) - \alpha(t-) \quad (t > 0).$$


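Let us suppose that the integral (1.4) converges for $x > \sigma_c$. Simple computations give

$$l_{k,t}[f(x)] = \frac{(2\pi k)^{1/2}}{k!} \left(\frac{k}{t}\right)^{k} \int_0^\infty e^{-ku/t} u^k\,d\alpha(u) \quad (k > t\sigma_c).$$

We first discuss the case $t = 1$,

$$l_{k,1}[f(x)] = \frac{(2\pi k)^{1/2} k^k}{k!} \int_0^\infty e^{-ku} u^k\,d\alpha(u) \quad (k > \sigma_c).$$

Integration by parts gives

$$l_{k,1}[f(x)] = \frac{(2\pi k)^{1/2} k^{k+1}}{k!} \int_0^\infty e^{-ku} u^{k-1}(u - 1)\alpha(u)\,du,$$

or, by Stirling's formula,

$$l_{k,1}[f(x)] \sim k e^k \int_0^\infty e^{-ku} u^{k-1}(u - 1)\alpha(u)\,du \quad (k \to \infty).$$

Set

$$\beta(x) = \alpha(1-) \quad (0 \le x < 1), \qquad \beta(1) = \alpha(1), \qquad \beta(x) = \alpha(1+) \quad (1 < x < \infty),$$

$$\varphi(x) = \alpha(x) - \beta(x) \quad (0 \le x < \infty).$$

Then

$$l_{k,1}[f(x)] \sim k e^k \int_0^\infty e^{-ku} u^{k-1}(u - 1)\beta(u)\,du + k e^k \int_0^\infty e^{-ku} u^{k-1}(u - 1)\varphi(u)\,du \quad (k \to \infty).$$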
Since 
= (x > 0), 
1 
we have by successive differentiation 
= — k\du (x > 0), 
1 


ef — 1]du, 
1 


1 
ef — 1]du. 
0 
Hence it is clear that _ 
f — 1)B(u)du = a(1+) — a(1—). 
0 


To prove that 
(1.5) lim alf(x)] = +) — a(1 —) 


it will be sufficient to show that 


lim ket f — 1)o(u)du = 0. 
0 


ko 


But $\varphi(u)$ is continuous and has the value zero at $u = 1$. Hence to an arbitrary positive $\varepsilon$ there corresponds a positive $\delta$ such that

$$|\varphi(u)| < \varepsilon \quad (|1 - u| < \delta).$$

By writing the integral (1.5) as the sum of three integrals over the intervals $(0, 1-\delta)$, $(1-\delta, 1+\delta)$, $(1+\delta, \infty)$ we easily arrive at the following inequality:


f — 1)p(u)du 
0 


Mhe[e®(1 — + 2e + + 5) 
Here
| o(u)| (0<u<1), 


n is a positive integer greater than ¢., and 


N= | du. 
Since

$$e^{\delta}(1 - \delta) < 1, \qquad e^{-\delta}(1 + \delta) < 1 \quad (\delta > 0),$$

the first and third terms on the right-hand side of the above inequality approach zero as $k$ becomes infinite, so that (1.5) is established.
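Returning to the general case we have by an obvious change of variable

$$l_{k,t}[f(x)] = \frac{(2\pi k)^{1/2} k^k}{k!} \int_0^\infty e^{-ku} u^k\,d\alpha(tu) = l_{k,1}[g(x)],$$

where

$$g(x) = \int_0^\infty e^{-xu}\,d\beta(u), \qquad \beta(u) = \alpha(tu).$$

Applying the previous result, we obtain

$$\lim_{k\to\infty} l_{k,t}[f(x)] = \lim_{k\to\infty} l_{k,1}[g(x)] = \beta(1+) - \beta(1-) = \alpha(t+) - \alpha(t-) \quad (t > 0).$$

The theorem is thus completely established.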
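COROLLARY 1. Under the conditions of the theorem

$$\lim_{k\to\infty} (-1)^k f^{(k)}\!\left(\frac{k}{t}\right) \left(\frac{e}{t}\right)^{k} = \alpha(t+) - \alpha(t-) \quad (t > 0).$$

This follows by an application of Stirling's formula.

COROLLARY 2. If $f(x)$ can be expressed in terms of a Dirichlet series

$$f(x) = \sum_{n=0}^{\infty} a_n e^{-\lambda_n x}, \qquad 0 \le \lambda_0 < \lambda_1 < \cdots \quad (\lim_{n\to\infty} \lambda_n = \infty),$$

convergent for some value of $x$, then

$$\lim_{k\to\infty} l_{k,\lambda_n}[f(x)] = a_n \quad (n = 1, 2, \cdots).$$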
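
The coefficient formula of Corollary 2 can be tried numerically. The sketch below is an illustration only, under stated assumptions: it takes a finite Dirichlet sum with positive exponents, uses the form $l_{k,t}[f] = (2\pi k)^{1/2}\,(k^k/k!)\,t^{-k}\,(-1)^k f^{(k)}(k/t)$ of the operator (equivalent to $t(2\pi/k)^{1/2}L_{k,t}[f]$), and works with logarithms so that large $k$ does not overflow. The function name `jump_operator_dirichlet` and the sample coefficients are hypothetical.

```python
import math


def jump_operator_dirichlet(coeffs, exponents, k, t):
    """l_{k,t}[f] for f(x) = sum a_n * exp(-lambda_n * x), all lambda_n > 0.

    Uses l_{k,t}[f] = sqrt(2*pi*k) * k^k / k! * t^(-k) * (-1)^k f^(k)(k/t),
    evaluated through logarithms so that large k does not overflow.
    """
    log_prefactor = (0.5 * math.log(2 * math.pi * k)
                     + k * math.log(k) - math.lgamma(k + 1))
    total = 0.0
    for a, lam in zip(coeffs, exponents):
        # (-1)^k f^(k)(x) contributes a * lam^k * exp(-lam * x) at x = k/t.
        log_term = log_prefactor + k * math.log(lam / t) - lam * k / t
        total += a * math.exp(log_term)
    return total


if __name__ == "__main__":
    coeffs = [2.0, -3.0, 0.5]      # hypothetical a_n
    exponents = [1.0, 2.0, 3.5]    # hypothetical lambda_n
    for t in (1.0, 1.5, 2.0, 3.5):
        print(t, jump_operator_dirichlet(coeffs, exponents, 400, t))
```

At $t = \lambda_n$ the value should be close to $a_n$ for large $k$, and at a point between two exponents it should be close to zero, in line with Theorem 1.3 (no jump of the determining function there).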
2. Determining function of bounded variation. In this section we shall discuss generating functions $f(x)$ for which the corresponding determining function is of bounded variation in the interval $(0, \infty)$. We introduce


CONDITION A. A function $f(x)$ satisfies Condition A in the interval $(0, \infty)$ if
(a) $f(x)$ is of class $C^\infty$ in the interval $0 < x < \infty$,
(b) a constant $M$ exists such that

$$\int_0^\infty |L_{k,t}[f(x)]|\,dt < M \quad (k = 1, 2, \cdots).$$

By use of this condition we may state the fundamental result concerning functions of the class under discussion in


THEOREM 2.1. Condition A is necessary and sufficient that $f(x)$ can be expressed in the form

$$f(x) = \int_0^\infty e^{-xt}\,d\alpha(t) \quad (x > 0),$$

where $\alpha(t)$ is of bounded variation in $(0, \infty)$.


This result was proved earlier by the author* in a slightly different form. In the earlier form Condition A (b) was replaced by

$$\int_0^\infty \frac{t^k}{k!}\, |f^{(k+1)}(t)|\,dt \le M \quad (k = 0, 1, 2, \cdots).$$

The equivalence of these two forms follows immediately from an obvious change of the variable of integration. In a later section of the present paper we shall give a new proof of the theorem and a new form of the condition.
We turn next to


* II, p. 866.

THEOREM 2.2. If

$$f(x) = \int_0^\infty e^{-xt}\,d\alpha(t),$$

where $\alpha(t)$ is a normalized function of bounded variation $V$ in the interval $(0, \infty)$, then

$$V = |f(\infty)| + \lim_{k\to\infty} \int_0^\infty |L_{k,t}[f(x)]|\,dt = |f(\infty)| + \lim_{k\to\infty} \int_0^\infty \frac{t^k}{k!}\, |f^{(k+1)}(t)|\,dt. \tag{2.1}$$

Moreover, the integral

$$\int_0^\infty |L_{k,t}[f(x)]|\,dt$$

is a non-decreasing function of the integer $k$.
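
The two assertions of Theorem 2.2 can be observed numerically on a simple case. The sketch below is an added illustration under stated assumptions, not the author's computation: it takes $f(x) = 1/(x+1) - 2/(x+2)$, whose determining function has density $\varphi(t) = e^{-t} - 2e^{-2t}$, so that $f(\infty) = 0$ and $V = \int_0^\infty |\varphi(t)|\,dt = 1/2$. For this $f$ the operator has the closed form used in `L_example`; the function names, the truncation point, and the step count are hypothetical choices.

```python
def L_example(k, t):
    """L_{k,t}[f] in closed form for the illustrative f(x) = 1/(x+1) - 2/(x+2)."""
    return (1.0 + t / k) ** (-(k + 1)) - 2.0 * (1.0 + 2.0 * t / k) ** (-(k + 1))


def integral_abs_L(k, upper=200.0, steps=100_000):
    """Trapezoidal estimate of the integral of |L_{k,t}[f]| over 0 <= t <= upper.

    The tail beyond `upper` decays like t^(-k), so it is negligible for k >= 5.
    """
    h = upper / steps
    total = 0.5 * (abs(L_example(k, 0.0)) + abs(L_example(k, upper)))
    for i in range(1, steps):
        total += abs(L_example(k, i * h))
    return total * h


if __name__ == "__main__":
    for k in (5, 20, 80):
        # Values should increase with k and approach V = 0.5 from below.
        print(k, integral_abs_L(k))
```

The printed values should form an increasing sequence bounded by $1/2$, which is what (2.1) and the monotonicity statement predict for this example.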


This result is an extension of Theorem 1.2 to the case of an infinite interval and can be easily derived from that theorem. First suppose that $f(\infty) = \alpha(0+) = 0$.

By Theorem 2.1 we have

$$\int_0^\infty |L_{k,t}[f(x)]|\,dt < M \quad (k = 1, 2, \cdots). \tag{2.2}$$

Since

$$\int_0^R |L_{k,t}[f(x)]|\,dt \le \int_0^\infty |L_{k,t}[f(x)]|\,dt,$$

we have by Theorem 1.2

$$V(R) \le \liminf_{k\to\infty} \int_0^\infty |L_{k,t}[f(x)]|\,dt < M,$$

whence

$$V \le \liminf_{k\to\infty} \int_0^\infty |L_{k,t}[f(x)]|\,dt.$$

On the other hand,

$$\limsup_{k\to\infty} \int_0^\infty |L_{k,t}[f(x)]|\,dt \le V.$$

Hence

$$\lim_{k\to\infty} \int_0^\infty |L_{k,t}[f(x)]|\,dt = V.$$

If $\alpha(0+)$ is different from zero, we apply the result just established to the function $f(x) - f(\infty)$ whose normalized determining function will have total variation equal to that of $\alpha(t)$ increased by $|f(\infty)|$. Hence (2.1) is completely established.
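Finally, to prove the last assertion of the theorem we note that

$$f^{(k)}(t) = -\int_t^\infty f^{(k+1)}(u)\,du, \qquad |f^{(k)}(t)| \le \int_t^\infty |f^{(k+1)}(u)|\,du.$$

These integrals clearly converge by virtue of (2.2). It follows that

$$\int_0^\infty \frac{t^{k-1}}{(k-1)!}\, |f^{(k)}(t)|\,dt \le \int_0^\infty \frac{t^{k}}{k!}\, |f^{(k+1)}(t)|\,dt.$$

This concludes the proof.
This result enables us to prove at once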


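THEOREM 2.3. Let

$$f(x) = \int_0^\infty e^{-xt}\,d\alpha(t), \tag{2.3}$$

where $\alpha(t)$ is a normalized function of bounded variation $V(R)$ in the interval $0 \le t \le R$ for every positive $R$, the integral converging absolutely for $x > \sigma_a$. Then

$$\int_0^\infty e^{-xt}\,dV(t) = |f(\infty)| + \lim_{k\to\infty} \int_x^\infty \frac{(u - x)^k}{k!}\, |f^{(k+1)}(u)|\,du \quad (x > \sigma_a).$$

For we have

$$f(u + x) = \int_0^\infty e^{-ut} e^{-xt}\,d\alpha(t) = \int_0^\infty e^{-ut}\,d\beta(t) \quad (u > 0),$$

where

$$\beta(t) = \int_0^t e^{-xu}\,d\alpha(u).$$

The total variation of $\beta(t)$ in the interval $0 \le t < \infty$ is

$$\int_0^\infty e^{-xt}\,dV(t),$$

an integral that converges for $x > \sigma_a$ since (2.3) converges absolutely there. Now applying Theorem 2.2 to $f(x+u)$ considered as a function of $u$ we have

$$\int_0^\infty e^{-xt}\,dV(t) - |f(\infty)| = \lim_{k\to\infty} \int_0^\infty \frac{u^k}{k!}\, |f^{(k+1)}(x + u)|\,du = \lim_{k\to\infty} \int_x^\infty \frac{(u - x)^k}{k!}\, |f^{(k+1)}(u)|\,du$$

and the theorem is established.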
3. Determining function monotonic. The results of the previous section may be applied to completely monotonic functions. We recall the following


DEFINITION. A function $f(x)$ is completely monotonic in the interval $c < x < \infty$ if

$$(-1)^k f^{(k)}(x) \ge 0 \quad (c < x < \infty;\ k = 0, 1, 2, \cdots);$$

$f(x)$ is completely monotonic in the interval $c \le x < \infty$ if, in addition, it approaches a limit as $x$ approaches $c$ from the right.


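We can easily show that if $f(x)$ is completely monotonic in $0 \le x < \infty$ then Condition A is satisfied. It is a familiar fact that such a function is analytic in the interval $0 < x < \infty$ so that A(a) is satisfied. Furthermore, since $L_{k,t}[f(x)]$ is non-negative, the change of variable $v = k/t$ followed by repeated integration by parts gives

$$\int_0^\infty |L_{k,t}[f(x)]|\,dt = \frac{(-1)^k}{(k-1)!} \int_0^\infty f^{(k)}(v)\, v^{k-1}\,dv = f(0+) - f(\infty) \quad (k = 1, 2, \cdots).$$

Hence A(b) is also fulfilled. By Theorem 2.1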
$$f(x) = \int_0^\infty e^{-xt}\,d\alpha(t) \quad (0 \le x < \infty), \tag{3.1}$$

where $\alpha(t)$ is of bounded variation in the interval $(0, \infty)$. Appealing to Theorem 1.1 we have

$$\alpha(t) = \alpha(0+) + \lim_{k\to\infty} \int_0^t L_{k,u}[f(x)]\,du \quad (0 < t < \infty).$$

Since $L_{k,u}[f(x)]$ is non-negative, the function $\alpha(t)$ is non-decreasing. Conversely, any function $f(x)$ of the form (3.1) with $\alpha(t)$ non-decreasing and bounded is clearly completely monotonic in $0 \le x < \infty$. Hence we have proved

THEOREM 3.1. A necessary and sufficient condition that $f(x)$ can be expressed in the form

$$f(x) = \int_0^\infty e^{-xt}\,d\alpha(t),$$

where $\alpha(t)$ is non-decreasing and bounded, is that $f(x)$ should be completely monotonic in $0 \le x < \infty$.

4. Determining function the integral of a function of class $L^p$, $p > 1$. In this section we discuss generating functions $f(x)$ of the form

$$f(x) = \int_0^\infty e^{-xt}\varphi(t)\,dt, \tag{4.1}$$

where the function $\varphi(t)$ is of class $L^p$ in the interval $(0, \infty)$; that is, the integral

$$\int_0^\infty |\varphi(t)|^p\,dt \quad (p > 1)$$

is finite. We first introduce

CONDITION B. A function $f(x)$ satisfies Condition B, if and only if
(a) it is of class $C^\infty$ in $0 < x < \infty$,
(b) a constant $M$ exists such that

$$\int_0^\infty |L_{k,t}[f(x)]|^p\,dt \le M \quad (k = 1, 2, \cdots),$$

(c)
$$\lim_{x\to\infty} f(x) = 0.$$

We now prove


THEOREM 4.1. A necessary and sufficient condition that $f(x)$ can be represented in the form

$$f(x) = \int_0^\infty e^{-xt}\varphi(t)\,dt$$

with $\varphi(t)$ of class $L^p$ in the interval $(0, \infty)$ is that Condition B should hold.


We note first that Condition B (b) is meaningless for $k = 0$ since $\varphi(t)$ need not belong to $L$. That is, $f(0)$ need not exist, as the following example shows:


f(x) -f e~*'h(t)dt -f 


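Here the function $\varphi(t)$ belongs to the class $L^p$ for any $p > 1$. But $f(0) = \infty$.

We prove first the necessity of Condition B. If $f(x)$ has the representation (4.1) with $\varphi(t)$ of class $L^p$ $(p > 1)$, then that integral converges absolutely for $x > 0$. For, by use of the Hölder inequality,

$$\int_0^\infty e^{-xt} |\varphi(t)|\,dt \le \left[\int_0^\infty e^{-qxt}\,dt\right]^{1/q} \left[\int_0^\infty |\varphi(t)|^p\,dt\right]^{1/p} \quad \left(\frac{1}{p} + \frac{1}{q} = 1,\ x > 0\right),$$

and the dominant integral converges by hypothesis.
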
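Another application of the Hölder inequality gives

$$|L_{k,t}[f(x)]|^p = \left| \int_0^\infty \frac{u^k}{k!} \left(\frac{k}{t}\right)^{k+1} e^{-ku/t} \varphi(u)\,du \right|^p \le \int_0^\infty \frac{u^k}{k!} \left(\frac{k}{t}\right)^{k+1} e^{-ku/t} |\varphi(u)|^p\,du \left[ \int_0^\infty \frac{u^k}{k!} \left(\frac{k}{t}\right)^{k+1} e^{-ku/t}\,du \right]^{p/q}.$$

This last factor is equal to unity, so that

$$\int_0^\infty |L_{k,t}[f(x)]|^p\,dt \le \int_0^\infty dt \int_0^\infty \frac{u^k}{k!} \left(\frac{k}{t}\right)^{k+1} e^{-ku/t} |\varphi(u)|^p\,du.$$

By use of the Fubini theorem we may interchange the order of integration in the iterated integral, since the resulting iterated integral,

$$\int_0^\infty |\varphi(u)|^p\,du \int_0^\infty \frac{u^k}{k!} \left(\frac{k}{t}\right)^{k+1} e^{-ku/t}\,dt,$$

converges and has the value

$$\int_0^\infty |\varphi(u)|^p\,du.$$

If we denote the value of this integral by $M$ we have Condition B (b). Conditions B (a) and B (c) are known consequences* of the representation (4.1) regardless of the class of $\varphi(t)$.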
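
The evaluation of the inner integral after the interchange can be spelled out. The short computation below is an added check, not part of the original text; it assumes only the substitution $v = ku/t$ (so $t = ku/v$ and $dt = -(ku/v^2)\,dv$) and the Gamma integral $\int_0^\infty v^{k-1}e^{-v}\,dv = (k-1)!$.

```latex
\begin{align*}
\int_0^\infty \frac{u^k}{k!}\left(\frac{k}{t}\right)^{k+1} e^{-ku/t}\,dt
  &= \frac{1}{k!}\int_0^\infty u^k \left(\frac{v}{u}\right)^{k+1} e^{-v}\,\frac{ku}{v^2}\,dv \\
  &= \frac{k}{k!}\int_0^\infty v^{k-1} e^{-v}\,dv
   = \frac{k\,(k-1)!}{k!} = 1 \qquad (u > 0).
\end{align*}
```

The same substitution in the variable $u$ shows that the bracketed factor in the Hölder step equals unity.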
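We turn next to the sufficiency of the condition. Let $\varepsilon$ be an arbitrary positive constant. Then

$$\int_0^\infty |L_{k,t}[f(x + \varepsilon)]|\,dt = \frac{1}{k!} \int_0^\infty \left| f^{(k)}\!\left(\frac{k}{t} + \varepsilon\right) \right| \left(\frac{k}{t}\right)^{k+1} dt. \tag{4.2}$$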


That the integral converges follows as a result of subsequent operations. First 
set 


Then (4.2) becomes 


1 kle 
kid 


k | k k2 


Again applying Hélder’s inequality we see that the integrals (4.2) and (4.3) 
are not greater than 


(4.3) 


* I, p. 702, and II, p. 864. 


| 
| 
k k 
—+en 
t u 
| 
The second integral of the product can be evaluated and has the value
k 


—q +1) 
But by B (b) 


so that (4.2) certainly converges and 


1 
L ds 
J + €) 


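Hence the function $f(x + \varepsilon)$ satisfies Condition A, and by Theorem 2.1

$$f(x + \varepsilon) = \int_0^\infty e^{-xt}\,d\alpha_\varepsilon(t),$$

the integral converging absolutely for $x \ge 0$. By the uniqueness theorem† we see that

$$f(x) = \int_0^\infty e^{-(x - \varepsilon)t}\,d\alpha_\varepsilon(t) = \int_0^\infty e^{-xt}\,d\alpha^*(t),$$

the integral converging absolutely for $x > 0$. Here

$$\alpha^*(t) = \int_0^t e^{\varepsilon u}\,d\alpha_\varepsilon(u).$$

If we set

$$\alpha(0) = 0, \qquad \alpha(t) = \frac{\alpha^*(t+) + \alpha^*(t-)}{2} \quad (t > 0),$$

we have finally

$$f(x) = \int_0^\infty e^{-xt}\,d\alpha(t),$$

the determining function being now normalized.

† I, p. 705.

It remains to show that $\alpha(t)$ is the integral of a function of class $L^p$. Set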
$$\alpha_k(t) = \int_0^t L_{k,u}[f(x)]\,du.$$

Then by Theorem 1.1

$$\lim_{k\to\infty} \alpha_k(t) = \alpha(t) \quad (0 \le t < \infty).$$

By Condition B (b) we see that there exists a subsequence of functions $L_{k_j,u}[f(x)]$ $(j = 1, 2, \cdots)$ which converges weakly to a function $\varphi(u)$ of class $L^p$ in $(0, \infty)$. That is,

$$\lim_{j\to\infty} \int_0^\infty L_{k_j,u}[f(x)]\,\psi(u)\,du = \int_0^\infty \varphi(u)\psi(u)\,du$$

for every function $\psi(u)$ of class $L^q$ in $(0, \infty)$. By choosing $\psi(u)$ as a step-function we have

$$\alpha(t) = \lim_{j\to\infty} \alpha_{k_j}(t) = \int_0^t \varphi(u)\,du,$$

so that $\alpha(t)$ is the integral of a function which is of class $L^p$ in $(0, \infty)$. Our theorem is completely established.
Our next result is 


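THEOREM 4.2. If $f(x)$ can be represented in the form

$$f(x) = \int_0^\infty e^{-xt}\varphi(t)\,dt \tag{4.4}$$

with $\varphi(t)$ of class $L^p$ $(p > 1)$ in the interval $0 \le t < \infty$, then

$$\lim_{k\to\infty} \int_0^\infty |L_{k,t}[f(x)]|^p\,dt = \int_0^\infty |\varphi(t)|^p\,dt,$$

the integral on the left-hand side constantly increasing with $k$.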


We prove this last statement first. Make the change of variable t= in 
the above integral. Then 


Pp 


k 1 k k+1 
f(a) k+1 
0 k! x? 


Sincef f(«)=0, we have 


II, p. 864. 


or


(k + = (4) |du. 


Since 
f (e+ = 1, 


we have by Hélder’s inequality 
+ 


we | (1g) | Pde. 
u 


(4.6) | (k+1)x*f(x) |? < 


By Theorem 4.1 we know that f(x) satisfies Condition B. This fact enables 
us to conclude that the integral (4.6) converges. For, 


f= | < < =f - | 


the dominant integral converging by (4.5) and Condition B. 
Consequently we have 


k 
J [et 


g(x) + 2| |Pdu. 


If it is permissible to interchange the order of integration, this iterated in- 
tegral becomes 


(k+ 1) 


The change of variable uf =k-+1 in this last integral gives us the inequality 


leas f | ] 
0 0 


The above interchange in the order of integration was permissible since the integrand was positive and since the resulting iterated integral is seen to converge.

We now prove the first statement of the theorem. Since $f(x)$ has the representation (4.4) we know* that

$$\lim_{k\to\infty} |L_{k,t}[f(x)]|^p = |\varphi(t)|^p$$

for almost all values of $t$ in $(0, \infty)$. It follows by Fatou's Lemma that

$$\int_0^\infty |\varphi(u)|^p\,du \le \liminf_{k\to\infty} \int_0^\infty |L_{k,u}[f(x)]|^p\,du.$$

This inequality, combined with the inequality

$$\int_0^\infty |L_{k,u}[f(x)]|^p\,du \le \int_0^\infty |\varphi(u)|^p\,du$$

proved in Theorem 4.1, establishes our result.
* III, p. 122.

5. Determining function the integral of a bounded function. By letting $p$ become infinite in the results of the previous section we are led to introduce


CONDITION C. A function $f(x)$ satisfies Condition C if and only if
(a) it is of class $C^\infty$ in $0 < x < \infty$,
(b) a constant $M$ exists such that

$$|L_{k,t}[f(x)]| \le M \quad (k = 1, 2, \cdots;\ 0 \le t < \infty),$$

(c)
$$\lim_{x\to\infty} f(x) = 0.$$

We can now prove


THEOREM 5.1. A necessary and sufficient condition that $f(x)$ can be represented in the form

$$f(x) = \int_0^\infty e^{-xt}\varphi(t)\,dt,$$

where

$$|\varphi(t)| < M \quad (0 \le t < \infty),$$

is that it should satisfy Condition C.


If $f(x)$ has the above representation, it is clear that it is of class $C^\infty$ in $0 < x < \infty$ and that $f(\infty) = 0$. Condition C (b) follows from the inequality

$$|L_{k,t}[f(x)]| \le M L_{k,t}[1/x] = M \quad (k = 1, 2, \cdots).$$


For the sufficiency let ¢ be an arbitrary positive constant. Then by use 
of the same transformation as that applied to (4.2) we have 


"| + )]| at J (1 


(1 du = —- 
0 k € 


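Hence, as in §4, we have

$$f(x) = \int_0^\infty e^{-xt}\,d\alpha(t),$$

where $\alpha(t)$ is a normalized function of bounded variation in $(0, \infty)$. Now defining $\alpha_k(t)$ as in §4 we have by Theorem 1.1

$$\lim_{k\to\infty} \alpha_k(t) = \alpha(t) \quad (0 \le t < \infty).$$

To show that $\alpha(t)$ is the integral of a bounded function it is sufficient to show that its difference quotient is bounded.*

But

$$\left| \frac{\alpha(t_2) - \alpha(t_1)}{t_2 - t_1} \right| = \lim_{k\to\infty} \left| \frac{\alpha_k(t_2) - \alpha_k(t_1)}{t_2 - t_1} \right| \le \limsup_{k\to\infty} \frac{1}{t_2 - t_1} \int_{t_1}^{t_2} |L_{k,u}[f(x)]|\,du \le M.$$

Hence Theorem 5.1 is completely established.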

* E. W. Hobson, The Theory of Functions of a Real Variable and the Theory of Fourier's Series, 2d edition, vol. 1, 1921, p. 549.

6. Determining function the integral of a function of class $L$. Theorem 4.1 was proved only for the case in which $p$ is greater than unity. We can easily see that it is no longer valid if $p$ is equal to unity. For, in that case Condition B reduces to Condition A. But a generating function satisfying the latter condition gives rise to a determining function which is merely of bounded variation and hence not necessarily the integral of any function. For the discussion of the case when $p$ is equal to unity we introduce


CONDITION D. A function $f(x)$ satisfies Condition D if and only if

(a) it is of class $C^\infty$ in $0 < x < \infty$,

(b) the function $L_{k,t}[f(x)]$ converges in the mean of order unity as $k$ becomes infinite:

$$\lim_{k\to\infty,\ l\to\infty} \int_0^\infty |L_{k,t}[f(x)] - L_{l,t}[f(x)]|\,dt = 0,$$

(c)
$$\lim_{x\to\infty} f(x) = 0.$$


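The proof of the main result of this section depends on the following


LEMMA. If the function $\varphi(t)$ is of class $L$ in the interval $(0, \infty)$, then the function

$$g(u) = \int_0^\infty |\varphi(ut) - \varphi(t)|\,dt \tag{6.1}$$

is continuous for $u = 1$ and a constant $A$ exists such that

$$|g(u)| \le A u^{-1} + A \quad (0 < u < \infty). \tag{6.2}$$

To prove this result set $t = e^x$ and $u = e^y$ in the integral (6.1). It thus becomes

$$g(e^y) = \int_{-\infty}^\infty |\varphi(e^{x+y}) - \varphi(e^x)|\, e^x\,dx.$$

The transformation shows that the function $\psi(x) = \varphi(e^x)e^x$ is integrable in $(-\infty, \infty)$. Furthermore,

$$|g(e^y)| \le \int_{-\infty}^\infty \{ |\psi(x + y)e^{-y} - \psi(x + y)| + |\psi(x + y) - \psi(x)| \}\,dx = \int_{-\infty}^\infty |\psi(x + y) - \psi(x)|\,dx + |e^{-y} - 1| \int_{-\infty}^\infty |\psi(x)|\,dx.$$

By appealing to a known result* we see that the right-hand side of this inequality approaches zero as $y$ approaches zero. That is,

$$\lim_{u\to 1} g(u) = 0.$$

If we set

$$A = \int_0^\infty |\varphi(u)|\,du,$$

the inequality (6.2) follows at once since

$$|g(u)| \le \int_0^\infty |\varphi(ut)|\,dt + \int_0^\infty |\varphi(t)|\,dt = A(1 + u^{-1}).$$


* See, for example, N. Wiener, The Fourier Integral and Certain of its Applications, p. 14.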
We are now able to prove


THEOREM 6.1. Condition D is necessary and sufficient that a function f(x)
can be represented in the form

\[ f(x) = \int_0^\infty e^{-xt} \varphi(t)\, dt, \]

where $\varphi(t)$ is of class L in the interval $(0, \infty)$.
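
A small numerical sketch may make the mean convergence in Condition D concrete. It assumes that the operator of §1 (not reproduced in this section) has the usual form $L_{k,t}[f(x)] = \frac{(-1)^k}{k!} f^{(k)}(k/t)\,(k/t)^{k+1}$, and it uses the sample pair $f(x) = 1/(x+1)$, $\varphi(t) = e^{-t}$, which is an illustration and not taken from the paper. Under that assumption $L_{k,t}[f(x)] = (k/(k+t))^{k+1}$ in closed form. The function names are hypothetical.

```python
import math

def post_widder_sample(k, t):
    """Closed form of L_{k,t}[f] for the sample f(x) = 1/(x+1),
    assuming L_{k,t}[f] = (-1)^k f^(k)(k/t) (k/t)^(k+1) / k!."""
    return (k / (k + t)) ** (k + 1)

def l1_error(k, upper=60.0, steps=20000):
    """Trapezoidal estimate of int_0^inf |L_{k,t}[f] - exp(-t)| dt."""
    h = upper / steps
    total = 0.0
    for i in range(steps + 1):
        t = i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * abs(post_widder_sample(k, t) - math.exp(-t))
    return total * h

for k in (5, 50, 200):
    print(k, l1_error(k))  # the L1 distance decreases as k grows
```

The decreasing L1 distances are what convergence in the mean of order unity asserts for this particular pair.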
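Suppose that f(x) has the above representation. Then simple computation
shows that
\[ |L_{k,t}[f(x)] - \varphi(t)| = \left| \frac{k^{k+1}}{k!} \int_0^\infty e^{-ku} u^k [\varphi(ut) - \varphi(t)]\, du \right| \le \frac{k^{k+1}}{k!} \int_0^\infty e^{-ku} u^k |\varphi(ut) - \varphi(t)|\, du. \]

Hence

\[ \int_0^\infty |L_{k,t}[f(x)] - \varphi(t)|\, dt \le \frac{k^{k+1}}{k!} \int_0^\infty dt \int_0^\infty e^{-ku} u^k |\varphi(tu) - \varphi(t)|\, du \]

if the iterated integral converges. We see that it does by an interchange in the
order of integration. If we define g(u) as in the Lemma it may be seen by use of
(6.2) that the resulting iterated integral converges for k>0. It follows that

\[ \int_0^\infty |L_{k,t}[f(x)] - \varphi(t)|\, dt \le \frac{k^{k+1}}{k!} \int_0^\infty e^{-ku} u^k g(u)\, du. \]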

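Corresponding to an arbitrary positive constant $\epsilon$ we now determine a
positive number $\delta$ less than unity such that

\[ |g(u)| < \epsilon/3 \qquad (|u - 1| < \delta). \]
We then have

\[ \frac{k^{k+1}}{k!} \int_{1-\delta}^{1+\delta} e^{-ku} u^k g(u)\, du < \frac{\epsilon}{3} \frac{k^{k+1}}{k!} \int_0^\infty e^{-ku} u^k\, du = \frac{\epsilon}{3}. \]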


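Furthermore, since $e^{-u}u$ is an increasing function of u in the interval (0, 1),

\[ \frac{k^{k+1}}{k!} \int_0^{1-\delta} e^{-ku} u^k g(u)\, du < \frac{k^{k+1}}{k!} e^{-(k-1)(1-\delta)} (1-\delta)^{k-1} \int_0^{1-\delta} e^{-u} A(u+1)\, du < \frac{\epsilon}{3} \qquad (k > k_0). \]

The last inequality holds for $k_0$ sufficiently large since

\[ \lim_{k \to \infty} \frac{k^{k+1}}{k!} e^{-(k-1)(1-\delta)} (1-\delta)^{k-1} = 0 \qquad (0 < \delta < 1). \]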
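In a similar way we have

\[ \frac{k^{k+1}}{k!} \int_{1+\delta}^\infty e^{-ku} u^k g(u)\, du < \frac{k^{k+1}}{k!} e^{-(k-1)(1+\delta)} (1+\delta)^{k-1} \int_{1+\delta}^\infty e^{-u} A(u+1)\, du < \frac{\epsilon}{3} \qquad (k > k_1) \]

for $k_1$ sufficiently large. Combining these results it follows that

\[ \int_0^\infty |L_{k,t}[f(x)] - \varphi(t)|\, dt < \epsilon \qquad (k > k_0,\ k > k_1). \]
Hence the necessity of Condition D is established.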
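To prove the sufficiency we note first that the convergence in the mean
of $L_{k,t}[f(x)]$ as k becomes infinite implies the existence of a function $\psi(t)$
of class L in $(0, \infty)$ such that

(6.3)  \[ \lim_{k \to \infty} \int_0^\infty |L_{k,t}[f(x)] - \psi(t)|\, dt = 0 \]
and
\[ \lim_{k \to \infty} \int_0^\infty |L_{k,t}[f(x)]|\, dt = \int_0^\infty |\psi(t)|\, dt. \]
Hence
\[ \int_0^\infty |L_{k,t}[f(x)]|\, dt \le \int_0^\infty |\psi(t)|\, dt + 1 = M \qquad (k \ge k_0) \]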


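for some integer $k_0$ sufficiently large. Hence Condition A is satisfied by f(x),
and by Theorem 2.1 we see that

\[ f(x) = \int_0^\infty e^{-xt}\, d\alpha(t), \]

where $\alpha(t)$ is a normalized function of bounded variation in the interval
$(0, \infty)$. It remains to show that $\alpha(t)$ is an integral. Set

\[ \alpha_k(t) = \int_0^t L_{k,u}[f(x)]\, du. \]

Then by Theorem 1.1
\[ \lim_{k \to \infty} \alpha_k(t) = \alpha(t) \qquad (0 \le t < \infty). \]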


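But it is known that if the sequence $L_{k,u}[f(x)]$ converges in mean to $\psi(u)$ then

\[ \lim_{k \to \infty} \int_0^t L_{k,u}[f(x)]\, du = \int_0^t \psi(u)\, du. \]

Hence

\[ \alpha(t) = \int_0^t \psi(u)\, du, \qquad f(x) = \int_0^\infty e^{-xt} \psi(t)\, dt, \]

with $\psi(t)$ of class L in $(0, \infty)$.

COROLLARY 1. If

\[ f(x) = \int_0^\infty e^{-xt} \varphi(t)\, dt, \]

where $\varphi(t)$ belongs to the class L in $(0, \infty)$, then
\[ \lim_{k \to \infty} \int_0^\infty |L_{k,t}[f(x)]|\, dt = \int_0^\infty |\varphi(t)|\, dt. \]

This is a known consequence of the mean convergence of the sequence $L_{k,t}[f(x)]$
to $\varphi(t)$.

COROLLARY 2. If
\[ \operatorname*{l.i.m.}_{k \to \infty} L_{k,t}[f(x)] = \varphi(t) \]
then
\[ \lim_{k \to \infty} L_{k,t}[f(x)] = \varphi(t) \]

almost everywhere.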


This follows at once from an earlier result* of the author. 
We can also prove 


COROLLARY 3. A necessary and sufficient condition that
(6.4)  \[ f(x) = \int_0^\infty e^{-xt} \varphi(t)\, dt, \]

where $\varphi(t)$ is of class L in the interval (0, R) for every R>0, the integral
converging absolutely for x>0, is that $f(x+\epsilon)$ should satisfy Condition D for every
positive $\epsilon$.


* III, p. 122. 
For, suppose that f(x) has the representation (6.4). Then the integral

\[ \int_0^\infty e^{-\epsilon t} |\varphi(t)|\, dt \]

converges and $e^{-\epsilon t}\varphi(t)$ is of class L in $(0, \infty)$. Hence $f(x+\epsilon)$ has the
representation of Theorem 6.1 with $\varphi(t)$ of that theorem replaced by the function
$e^{-\epsilon t}\varphi(t)$. It follows that $f(x+\epsilon)$ satisfies Condition D.

Conversely, if $f(x+\epsilon)$ satisfies Condition D we have

(6.5)  \[ f(x+\epsilon) = \int_0^\infty e^{-xt} \varphi_\epsilon(t)\, dt, \]

where $\varphi_\epsilon(t)$ belongs to the class L in $(0, \infty)$ for each $\epsilon > 0$. Then

(6.6)  \[ f(x) = \int_0^\infty e^{-xt} \varphi(t)\, dt, \qquad \varphi(t) = e^{\epsilon t} \varphi_\epsilon(t). \]

The uniqueness theorem shows that $\varphi(t)$ is independent of $\epsilon$. It is clear from
the definition of $\varphi(t)$ that it belongs to the class L in (0, R) for each R>0.
By Theorem 6.1, (6.5) converges absolutely for x=0. Hence (6.6) converges
absolutely for $x = \epsilon$ for each $\epsilon > 0$, that is, for x>0.

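This corollary enables us to treat such familiar integrals as

\[ \frac{\Gamma(\alpha+1)}{x^{\alpha+1}} = \int_0^\infty e^{-xt} t^{\alpha}\, dt, \]

\[ \frac{x}{x^2+1} = \int_0^\infty e^{-xt} \cos t\, dt, \]

\[ \frac{1}{x^2+1} = \int_0^\infty e^{-xt} \sin t\, dt. \]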
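
The shift used in Corollary 3 can be checked numerically on two of these integrals. The sketch below is an illustration under stated assumptions: it takes $\varphi(t) = \cos t$ and $\varphi(t) = \sin t$, multiplies by $e^{-\epsilon t}$ as in the proof, and compares a trapezoidal estimate of the Laplace integral with the closed forms $x/(x^2+1)$ and $1/(x^2+1)$ evaluated at $x+\epsilon$. The helper `laplace` and the sample values of x and $\epsilon$ are hypothetical choices.

```python
import math

def laplace(phi, x, upper=80.0, steps=200000):
    """Trapezoidal estimate of int_0^upper exp(-x t) phi(t) dt (tail neglected)."""
    h = upper / steps
    total = 0.0
    for i in range(steps + 1):
        t = i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * math.exp(-x * t) * phi(t)
    return total * h

eps, x = 0.5, 1.0  # sample shift and sample argument

# f(x + eps) as the Laplace integral of exp(-eps t) phi(t), which is of class L on (0, inf)
cos_numeric = laplace(lambda t: math.exp(-eps * t) * math.cos(t), x)
sin_numeric = laplace(lambda t: math.exp(-eps * t) * math.sin(t), x)

cos_closed = (x + eps) / ((x + eps) ** 2 + 1.0)
sin_closed = 1.0 / ((x + eps) ** 2 + 1.0)

print(cos_numeric, cos_closed)
print(sin_numeric, sin_closed)
```

Neither cos t nor sin t is of class L on the whole half-line, which is why the corollary works with $f(x+\epsilon)$ rather than f(x) itself.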


COROLLARY 4. A necessary and sufficient condition that

\[ f(x) = \int_0^\infty e^{-xt} \varphi(t)\, dt, \]

where $\varphi(t)$ is absolutely continuous and of bounded variation in $(0, \infty)$ and
vanishes at t=0, is that the function xf(x) should satisfy Condition D.


The proof may easily be supplied. 

7. The general Laplace-Stieltjes integral. In this section we shall obtain
necessary and sufficient conditions for the representation of a function
f(x) in the form

\[ f(x) = \int_0^\infty e^{-xt}\, d\alpha(t), \]

where $\alpha(t)$ is of bounded variation in $0 \le t \le R$ for every R>0, the integral
converging for x sufficiently large. It will be sufficient to restrict attention
to convergence for x>0, since a suitable change of variable reduces the general
problem to this case. We first introduce


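CONDITION E. A function f(x) satisfies Condition E if and only if
(a) it is of class $C^\infty$ in $0 < x < \infty$,
(b) a constant M exists such that

\[ \left| \int_0^R L_{k,t}[f(x)]\, dt \right| \le M \qquad (R > 0,\ k = 1, 2, \cdots), \]

(c) a positive function N(t), defined for t>0, exists such that

\[ \int_0^R |L_{k,t}[f(x)]|\, dt \le N(R) \qquad (R > 0;\ k = 1, 2, \cdots). \]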


We can now prove 


THEOREM 7.1. Condition E is necessary and sufficient that f(x) can be
represented in the form

\[ f(x) = \int_0^\infty e^{-xt}\, d\alpha(t), \]

where $\alpha(t)$ is a normalized function of bounded variation in $0 \le t \le R$ for every
positive R and is bounded in $0 \le t < \infty$.


In the definition of Condition E it is to be understood that for each
positive value of t, N(t) is defined as a finite number. To prove the necessity of
the condition we note first that if f(x) has the representation described in the
theorem, we may also write

\[ f(x) = x \int_0^\infty e^{-xt} \alpha(t)\, dt \qquad (x > 0). \]

Moreover, since $\alpha(t)$ is of bounded variation in the neighborhood of t=0, we
have

\[ \lim_{t \to 0} \frac{1}{t} \int_0^t \alpha(u)\, du = \alpha(0+). \]


Hence we are in a position to apply a known result* to be assured of the 
existence of a constant M such that 


* III, Theorem 21, p. 152. 


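(7.1)  \[ \left| \int_x^\infty \frac{u^{k-1}}{(k-1)!} f^{(k)}(u)\, du \right| \le M \qquad (x > 0;\ k = 1, 2, \cdots). \]

By the change of variable u=k/t this becomes

\[ \left| \int_0^{k/x} L_{k,t}[f]\, dt \right| \le M. \]

Since x was an arbitrary positive number in (7.1) we have

\[ \left| \int_0^R L_{k,t}[f(x)]\, dt \right| \le M \qquad (R > 0,\ k = 1, 2, \cdots) \]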

for an arbitrary R. Hence f(x) satisfies Condition E (b). That it satisfies
Condition E (a) is obvious. To show that it also satisfies Condition E (c) we make
use of Theorem 1.2. Assuming $\alpha(t)$ normalized we have

\[ \lim_{k \to \infty} \int_0^R |L_{k,t}[f(x)]|\, dt = V(R). \]

It follows that for each positive R we can determine a number N(R) such that

\[ \int_0^R |L_{k,t}[f(x)]|\, dt \le N(R), \]

so that the necessity of Condition E is completely established. 
To prove the sufficiency we note first that Condition E (b) implies* the
existence of a function $\alpha(t)$ bounded in $(0, \infty)$ and such that

\[ \lim_{t \to 0} \frac{1}{t} \int_0^t \alpha(u)\, du \]

exists, and that
\[ f(x) = x \int_0^\infty e^{-xt} \alpha(t)\, dt \qquad (x > 0). \]

It remains to show that $\alpha(t)$ is of bounded variation in (0, R) for every
positive R to be assured that

(7.2)  \[ f(x) = \int_0^\infty e^{-xt}\, d\alpha(t) \qquad (x > 0). \]


To show this we recall† first that


* III, Theorem 21, p. 152.
† III, Theorem 4, p. 122.


\[ \lim_{k \to \infty} L_{k,t}[f(x)/x] = \alpha(t) \]

for almost all values of t in $(0, \infty)$. Set

\[ \alpha_k(t) = L_{k,t}[f(x)/x], \qquad \alpha_k(0) = 0. \]

Then
\[ \lim_{k \to \infty} \alpha_k(t) = \alpha(t) \]
when t lies in a set of points E of the interval $(0, \infty)$ whose complement with
respect to that interval is of measure zero. But we are at once able to con- 
clude* that 


By the change of variable u=(k+1)/v this equation becomes 


(k+1)t/k 
= + f(x) leo, 
0 
whence 
R 2R 
f | daxtt)| + f | do < | f(%)| + N(2R). 


That is, the set of functions $\alpha_k(t)$ is of uniformly bounded variation in the
interval (0, R). As k becomes infinite $\alpha_k(t)$ approaches $\alpha(t)$ on the set E, so
that $\alpha(t)$ is of bounded variation on that part of E which lies in the interval
(0, R). By a result† from the theory of functions of a real variable we
conclude that $\alpha(t)$ coincides on E with a function $\bar{\alpha}(t)$ which is of bounded
variation on (0, R) for every positive R. Hence, if we redefine $\alpha(t)$ on the set
complementary to E as $\bar{\alpha}(t)$, a process which may be carried out without
changing the value of f(x), it becomes a function of bounded variation on (0, R) for
every positive R. In particular we may define $\alpha(t)$ so as to be normalized.
Then

\[ \int_0^R e^{-xt}\, d\alpha(t) = e^{-xR} \alpha(R) + x \int_0^R e^{-xt} \alpha(t)\, dt. \]


* III, Theorem 14, Corollary 1, p. 140. 
† See, for example, S. Saks, Théorie de l’Intégrale, Warsaw, 1933, p. 149, Theorem 1.
Now let R become infinite. Since $\alpha(t)$ is bounded in $(0, \infty)$,

\[ \lim_{R \to \infty} e^{-xR} \alpha(R) = 0 \qquad (x > 0). \]

Hence the integral (7.2) converges for x>0 and our theorem is proved.
This theorem fails of complete generality in that the function $\alpha(t)$ is
assumed uniformly bounded in $(0, \infty)$. Complete generality is attained in


THEOREM 7.2. A necessary and sufficient condition that f(x) should have the
representation

(7.3)  \[ f(x) = \int_0^\infty e^{-xt}\, d\alpha(t), \]

where $\alpha(t)$ is of bounded variation in (0, R) for every positive R, the integral
converging for x>0, is that for each positive $\epsilon$ the function $f(x+\epsilon)$ should satisfy
Condition E.


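For, if f(x) has the representation (7.3) then

(7.4)  \[ f(x+\epsilon) = \int_0^\infty e^{-xt}\, d\beta_\epsilon(t), \qquad \beta_\epsilon(t) = \int_0^t e^{-\epsilon u}\, d\alpha(u). \]

But, since (7.3) converges for x>0, to each positive $\epsilon$ corresponds* a
constant $K_\epsilon$ such that

\[ |\alpha(t)| < K_\epsilon e^{\epsilon t}. \]

By means of this inequality it is easily shown that $\beta_\epsilon(t)$ is uniformly
bounded in $(0, \infty)$. Hence Theorem 7.1 is applicable and $f(x+\epsilon)$ satisfies
Condition E. Conversely, if $f(x+\epsilon)$ satisfies E, (7.4) holds with $\beta_\epsilon(t)$ bounded.
That is,

(7.3)  \[ f(x) = \int_0^\infty e^{-xt}\, d\alpha(t) \quad (x > \epsilon), \qquad \alpha(t) = \int_0^t e^{\epsilon u}\, d\beta_\epsilon(u). \]

By the uniqueness theorem $\alpha(t)$ is independent of $\epsilon$, and, since $\epsilon$ is arbitrary,
the integral (7.3) converges for x>0.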

We can give Theorem 7.2 a different form which may be more useful in 
the application of the condition. We state the result in the 


* I, Lemma 2 of §3, p. 703. 
COROLLARY. A necessary and sufficient condition that f(x) should have the
representation

\[ f(x) = \int_0^\infty e^{-xt}\, d\alpha(t), \]

the integral converging for x>0, is that for each positive $\epsilon$ there should exist a
constant $M_\epsilon$ and a function $N_\epsilon(t)$ defined for $0 < t < \infty$, such that


R te k+1 


R 
J, 
The proof of this follows from Theorem 7.2 by a change of variable. If 


k k 
—+e=-—»> 
t 


te 
als(a) <) a < NR) (R2=1,0<R< k/e). 


we have 
R kR/(k+eR) ue\ 
J + = f du, 
0 0 


and, since R is an arbitrary positive constant, 


kR 


< 
k+eR € 


0< 


We are now able to improve on the generality of Corollary 3 of Theorem
6.1. It failed of complete generality in that it had to do with absolutely
convergent integrals. The general case is treated in

THEOREM 7.3. A necessary and sufficient condition that f(x) should have the
representation

\[ f(x) = \int_0^\infty e^{-xt} \varphi(t)\, dt, \]

where $\varphi(t)$ is a function of class L in (0, R) for every positive R, is that for every
positive $\epsilon$ the function $f(x+\epsilon)$ should satisfy Condition E and that the limit

\[ \lim_{k \to \infty} \int_0^t L_{k,u}[f(x)]\, du, \]

which will then exist for every positive t, should be absolutely continuous.

The result follows at once from Theorem 7.2, and the proof is omitted.

8. Determining function possessing derivatives. In this section we are
able to discuss the case in which the determining function $\alpha(t)$ has a
derivative of order n which is itself of bounded variation in the interval $(0, \infty)$.
A necessary and sufficient condition on the generating function will be
obtained. As a preliminary to the main result we prove the

LEMMA. If f(x) is of class $C^{n+1}$ in the interval $(0, \infty)$, and if

\[ \int_0^\infty \frac{u^n}{n!} |f^{(n+1)}(u)|\, du \]

converges, then the constants $A_0, A_1, \cdots, A_n$ in the function

\[ F(x) = f(x) - \sum_{k=0}^{n} A_k \frac{x^k}{k!} \]

can be so determined that the integrals

\[ \int_0^\infty \frac{u^k}{k!} |F^{(k+1)}(u)|\, du \qquad (k = 0, 1, \cdots, n) \]

converge.
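If the constants $A_i$ are arbitrary,
\[ F^{(n+1)}(x) = f^{(n+1)}(x), \]

so that the integral
\[ \int_0^\infty \frac{u^n}{n!} |F^{(n+1)}(u)|\, du \]

converges by hypothesis. Also we have

\[ F^{(n)}(x) = f^{(n)}(x) - A_n. \]
But

\[ \int_1^\infty |F^{(n+1)}(u)|\, du \le \int_1^\infty u^n |F^{(n+1)}(u)|\, du, \]

so that the integral

\[ \int_1^\infty F^{(n+1)}(u)\, du \]

converges absolutely, and
\[ \lim_{x \to \infty} F^{(n)}(x) = \lim_{x \to \infty} f^{(n)}(x) - A_n \]
exists regardless of the manner in which $A_n$ is determined. We now determine
it so that
\[ A_n = \lim_{x \to \infty} f^{(n)}(x). \]

Then

\[ \lim_{x \to \infty} F^{(n)}(x) = 0. \]
Under these circumstances we have

\[ F^{(n)}(x) = -\int_x^\infty F^{(n+1)}(u)\, du, \]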


= fi (x) | 
0 n! 


We will now show that the constants A; can be determined by the recursion 
formula 


A,x*-* ] 
= — cco 
(n—k)! (n—k-—1)! 

(h = @,1,---, 2). 


(8.1) A, = lim lize 


Assume that we have proved the existence of the above limits for k=n, 
n—1,---,p+1, and that we have shown the convergence of the integrals 


f (a) | de (k=n,n—1,---,). 
0 ! 


Then the previous argument shows that 


lim F‘?)(x) = lim A, A p41% 
exists regardless of the manner in which Ao, Ai, ---, A» are determined. 
Here the limit (8.1) exists for k = p, and if we define A , by (8.1) with k= p we 


have 
lim F‘)(x) = 0. 


Then applying the earlier argument we see that 


f 


also converges. We are now in a position to complete the proof of the lemma
by induction.
As an immediate consequence we see that the limits 


lim 


exist under the hypothesis of the lemma. 
We make use of this Lemma to prove

THEOREM 8.1. A necessary and sufficient condition that f(x) can be
represented in the form

\[ f(x) = \sum_{k=0}^{n} A_k \frac{x^k}{k!} + \int_0^\infty e^{-xt}\, d\alpha(t), \]

where $A_0, A_1, \cdots, A_n$ are constants and $\alpha(t)$ is of bounded variation in $(0, \infty)$,
is that

(8.2)  \[ \int_0^\infty \frac{u^k}{k!} |f^{(k+1)}(u)|\, du \le M \qquad (k = n, n+1, \cdots). \]

To prove the necessity of the condition apply Theorem 2.1 to the function

(8.3)  \[ F(x) = f(x) - \sum_{k=0}^{n} A_k \frac{x^k}{k!}. \]
Since
\[ F^{(k+1)}(x) = f^{(k+1)}(x) \qquad (k = n, n+1, \cdots), \]

the necessity of (8.2) is established.
For the sufficiency we first determine a function F(x) in the form (8.3),
determining the constants $A_i$ as in the lemma in such a way that the integrals

(8.4)  \[ \int_0^\infty \frac{u^k}{k!} |F^{(k+1)}(u)|\, du \qquad (k = 0, 1, \cdots, n) \]

all converge. If we choose M' greater than M and greater than each of the
integrals (8.4) we have

\[ \int_0^\infty \frac{u^k}{k!} |F^{(k+1)}(u)|\, du < M' \qquad (k = 0, 1, 2, \cdots). \]
Hence, by another application of Theorem 2.1, we have

\[ F(x) = \int_0^\infty e^{-xt}\, d\alpha(t), \]

where $\alpha(t)$ is of bounded variation in the interval $(0, \infty)$. The result leads
at once to the

COROLLARY. A necessary and sufficient condition that f(x) can be
represented in the form

\[ f(x) = \sum_{k=0}^{n} A_k \frac{x^k}{k!} + \int_0^\infty e^{-xt}\, d\alpha(t), \]

where $\alpha(t)$ is of bounded variation in the interval $(0, \infty)$, is that

\[ \int_0^\infty |L_{k,t}[f(x)]|\, dt \le M \qquad (k = n+1, n+2, \cdots). \]


We come now to the principal result of the present section. For
convenience in stating the result let us extend the definition of $L_{k,t}[f(x)]$ to the case
k=0 in such a way that


f | f [ery] | — ae. 
0 0 nN: 


di™tt dt™*1 


Then the result may be stated as follows: 


THEOREM 8.2. A necessary and sufficient condition that f(x) can be
represented in the form

(8.5)  \[ f(x) = \int_0^\infty e^{-xt} \varphi(t)\, dt, \]

where $\varphi(t)$ has a derivative of order n which is of bounded variation in the
interval $(0, \infty)$, is that there should exist a constant M such that


To prove the necessity let f(x) have the representation (8.5). Then in- 
tegration by parts gives 


af(x) = (0) +f e~**p’(t)dt, 


provided 
lim = 
lo 
But since ¢‘”(¢) is of bounded variation in (0, ©) the integral 


H(x) -f (t) 


converges for x=0 and has the value


A(x) = — (0) + (t) dt (x > 0). 


Another integration by parts gives 


H(x) = — xp (0) + of (t) dt (x > 0). 
0 


That 
(8.7) lim e~**g("-) (4) = 0 (x > 0) 


follows from the fact that ¢‘”(é) is bounded. For, since 
o™(t) = O(1) (t— 
then 
t 
= J o™(u)du + (0) = 
0 


and this result is sufficient to insure (8.7). In a similar way we see that 

(t) =O"-*) (t>0,k=0,1,---,n), 
so that 

lim e~*'¢)(#) = 0 


tro 


= Le Os ++ am 
i=0 0 


Hence if 


n 


F(x) = atlf(x) — o(0) 


t=0 


we see that F(x) must satisfy Condition A. Consequently a constant M’ 
exists such that 


u*® 
f | (u) | du (k =0,1,2,---). 
0 
For k=n this becomes


(8.8) fi ] | ae < M’ (k=n,n+1,---). 


Now consider the integral 


Sm ] | dt 


By definition of the operator L;,:[f(x)] it has the value 


bed untk 
0 
(n + k)! 
im 
kw kik” 


we see by (8.8) for k=u+1, n+2, - - - that there exists a constant M’’ such 
that 


” 
aes 
By a previous result* this becomes 


gen dt=M 
Setting k =n in (8.8) and making use of the convention established concern- 
ing Lo,+[f(x) ] we see that for a suitable constant M 


qnti 
The necessity of the condition is thus established. 

Conversely, if inequalities (8.6) hold, then inequalities (8.8) are also satis- 
fied. By Theorem 8.1 constants Ao, A:,---, A, and a function a(é) of 
bounded variation in the interval (0, ©) exist such that 
x 


= +f e~*da(t). 


No loss of generality results from assuming a(0) = Ao. 
Integration by parts gives 
* III, p. 135, Lemma. 
orld) = Ar +
0 


A t 


An 
o(t) = ¢n(t) = + on—1(u)du. 


Then successive integration by parts gives 


A,x* 
3 0 


n! 


= 


0 


But ¢(2) is of class C*—! in (0, ©). In fact it has an mth derivative 
o™(t) = a(t) 


which is of bounded variation in (0, ©), so that the sufficiency of our condi- 
tion is also established. 

9. Continuous determining function. By use of the operator $l_{k,t}[f(x)]$
defined in §1 we now deduce

THEOREM 9.1. If f(x) has the representation

\[ f(x) = \int_0^\infty e^{-xt}\, d\alpha(t), \]

where $\alpha(t)$ is a normalized function of bounded variation in (0, R) for every
positive R, the integral converging for some x, then a necessary and sufficient condition
that $\alpha(t)$ should be continuous for t>0 is that

(9.1)  \[ \lim_{k \to \infty} l_{k,t}[f(x)] = 0 \qquad (t > 0). \]


The proof follows at once from Theorem 1.3. We also point out that (9.1) 
may be replaced by 


(et)*f (kt) = 0 (t> 0). 
COROLLARY. Under the conditions of the theorem $\alpha(t)$ is continuous for
$0 \le t < \infty$ if and only if


lim (et)*f((kt) = 0 0), 
kw 


\[ \lim_{x \to \infty} f(x) = 0. \]

This follows since
\[ \alpha(0+) = f(\infty). \]
As an example of the type of result that can now be proved we state 


THEOREM 9.2. A necessary and sufficient condition that f(x) can be
represented in the form

\[ f(x) = \int_0^\infty e^{-xt}\, d\alpha(t) \qquad (\alpha(0) = 0), \]

where $\alpha(t)$ is of bounded variation in $(0, \infty)$ and is continuous in $0 \le t < \infty$, is
that Condition A should be satisfied and that

\[ \lim_{k \to \infty} l_{k,t}[f(x)] = 0 \qquad (t > 0), \]

\[ \lim_{x \to \infty} f(x) = 0. \]
10. The Stieltjes integral equation. In this section we shall discuss the
class of generating functions whose determining functions have first
derivatives which are completely monotonic. Since a completely monotonic
function is a Laplace integral we are treating generating functions whose
determining functions are themselves generating functions. We have shown that
such functions f(x) can be represented in the form

\[ f(x) = \int_0^\infty \frac{d\alpha(t)}{x+t}, \]

and such an equation is known as an integral equation of Stieltjes. We shall
obtain a necessary and sufficient condition that it shall have a non-decreasing
solution $\alpha(t)$. The result is contained in


THEOREM 10.1. A necessary and sufficient condition that f(x) can be
expressed in the form

(10.1)  \[ f(x) = \int_0^\infty \frac{d\alpha(t)}{x+t}, \]

where $\alpha(t)$ is uniformly bounded and non-decreasing in $(0, \infty)$, is that

(10.2)  \[ (-1)^k [x^n f(x)]^{(n+k)} \ge 0 \qquad (x > 0;\ n, k = 0, 1, 2, \cdots), \]

(10.3)  \[ \lim_{x \to \infty} x f(x) \ \text{exists.} \]

To prove the necessity of the condition we note first that if $\alpha(t)$ is
uniformly bounded and non-decreasing, then the integral (10.1) converges for
x>0. For such x we have

\[ [x^n f(x)]^{(n)} = n! \int_0^\infty \frac{t^n\, d\alpha(t)}{(x+t)^{n+1}}, \]

\[ (-1)^k [x^n f(x)]^{(n+k)} = (n+k)! \int_0^\infty \frac{t^n\, d\alpha(t)}{(x+t)^{n+k+1}} \qquad (n, k = 0, 1, 2, \cdots). \]

Differentiation under the integral sign is easily justified for x>0. One sees by
inspection of this integral that it is non-negative for positive x so that (10.2)
is established. To establish (10.3) we have

| R 

date | s a(R) + a(R)] 

0 x a R 


for every positive R. From this it is clear that

\[ \lim_{x \to \infty} x f(x) = \int_0^\infty d\alpha(t) = \alpha(\infty). \]
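
The necessity argument can be tried out on a step function. The sketch below is illustrative only: it assumes that (10.2) reads $(-1)^k [x^n f(x)]^{(n+k)} \ge 0$ with the integral form $(n+k)! \int_0^\infty t^n\, d\alpha(t)/(x+t)^{n+k+1}$, and it takes $\alpha(t)$ to be a step function with a few positive jumps chosen arbitrarily here. The names `stieltjes`, `moment_kernel` and `jumps` are hypothetical.

```python
import math

def stieltjes(x, jumps):
    """f(x) = sum of w / (x + t) over jumps (t, w): Stieltjes integral of a step function."""
    return sum(w / (x + t) for t, w in jumps)

def moment_kernel(x, jumps, n, k):
    """(n+k)! * sum of w t^n / (x+t)^(n+k+1): assumed integral form of (-1)^k [x^n f(x)]^(n+k)."""
    return math.factorial(n + k) * sum(w * t ** n / (x + t) ** (n + k + 1) for t, w in jumps)

# sample non-decreasing, bounded alpha: jumps w > 0 located at points t > 0
jumps = [(0.5, 1.0), (2.0, 0.25), (7.0, 3.0)]
total_mass = sum(w for _, w in jumps)  # alpha(infinity), taking alpha(0) = 0

# (10.3): x f(x) tends to alpha(infinity)
for x in (1e2, 1e4, 1e6):
    print(x, x * stieltjes(x, jumps), total_mass)

# case n = 1, k = 0: d/dx [x f(x)] by central difference against the integral form
x0, h = 1.5, 1e-5
numeric = ((x0 + h) * stieltjes(x0 + h, jumps) - (x0 - h) * stieltjes(x0 - h, jumps)) / (2 * h)
print(numeric, moment_kernel(x0, jumps, 1, 0))
```

Because every jump is positive, `moment_kernel` is a sum of positive terms, which is the non-negativity asserted in (10.2).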


Conversely, we see that (10.2) implies that for each fixed non-negative
integer n the function $[x^n f(x)]^{(n)}$ is completely monotonic for x>0. Hence
by Theorem 3.1

(10.4)  \[ [x^n f(x)]^{(n)} = \int_0^\infty e^{-xt}\, d\beta_n(t) \qquad (x > 0;\ n = 0, 1, 2, \cdots), \]

where each $\beta_n(t)$ is a normalized non-decreasing function and the integral
converges for x>0. Since

\[ [x^{n+1} f(x)]^{(n+1)} = x [x^n f(x)]^{(n+1)} + (n+1) [x^n f(x)]^{(n)}, \]
we have by differentiating (10.4)
J


This becomes, after integration by parts, 


e~*"tdB,(t) + (n + 1) f e~*'dB,,(t). 
0 


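The uniqueness theorem for Laplace* integrals now shows that
\[ \int_0^t \beta_{n+1}(u)\, du = -\int_0^t u\, d\beta_n(u) + (n+1) \int_0^t \beta_n(u)\, du, \]
or, again integrating by parts, that
\[ t \beta_n(t) = (n+2) \int_0^t \beta_n(u)\, du - \int_0^t \beta_{n+1}(u)\, du. \]

This equation shows at once that all of the $\beta_n(t)$ are continuous for t>0.
Hence the equation may be differentiated, showing that all $\beta_n'(t)$ exist and
are continuous for t>0. Thus,

\[ t \beta_n'(t) = (n+1) \beta_n(t) - \beta_{n+1}(t). \]
Since all $\beta_n(t)$ are now known to be of class C', we see that the $\beta_n''(t)$ exist
and are continuous for t>0 and that
(10.5)  \[ t \beta_n''(t) = n \beta_n'(t) - \beta_{n+1}'(t). \]

Thus, we can show by induction that the $\beta_n(t)$ are of class $C^\infty$ for t>0.
We now rewrite equation (10.5) as follows:

\[ \frac{\beta_{n+1}'(t)}{t^{n+1}} = -\frac{d}{dt} \left[ \beta_n'(t)/t^n \right] \qquad (t > 0). \]

Repeated application of this formula to itself yields

\[ \frac{\beta_{n+1}'(t)}{t^{n+1}} = (-1)^{n+1} \frac{d^{n+1}}{dt^{n+1}} \left[ \beta_0'(t) \right] \qquad (n = 0, 1, 2, \cdots). \]
Since all $\beta_n(t)$ are known to be non-decreasing, the left-hand side is
non-negative for positive t. Hence $\beta_0'(t)$ is completely monotonic for t>0. A
further application of Theorem 3.1 shows that

\[ \beta_0'(t) = \int_0^\infty e^{-tu}\, d\alpha(u) \qquad (t > 0), \]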
where $\alpha(u)$ is non-decreasing and the integral converges for t>0.
We show next that $\alpha(u)$ is bounded in $(0, \infty)$. By (10.3) we see that

\[ \beta_0(0+) = \lim_{x \to \infty} f(x) = 0, \]

so that $\beta_0(t)$ is continuous in the interval $0 \le t < \infty$. Moreover, since $\beta_0(t)$ is
non-decreasing, a familiar Tauberian theorem* is applicable, and we have

(10.6)  \[ \lim_{t \to 0+} \frac{\beta_0(t)}{t} = \lim_{x \to \infty} x f(x). \]

Now $\beta_0'(t)$, being completely monotonic, is a decreasing function of t in
$(0, \infty)$. Then (10.6) shows that

\[ \alpha(\infty) = \beta_0'(0+) = \lim_{t \to 0+} \frac{\beta_0(t)}{t}. \]

Hence $\alpha(u)$ is bounded in $(0, \infty)$. That is,

\[ f(x) = \int_0^\infty e^{-xt}\, dt \int_0^\infty e^{-tu}\, d\alpha(u) = \int_0^\infty \frac{d\alpha(u)}{x+u}. \]

This completes the proof of the theorem.


COROLLARY. Condition (10.2) may be replaced by


(10.7) (—1) 20 (¢>0;k =1,2,---;”=0,1,2,---), 
= 0 (¢>0;" =0, 1, 2,°°*). 
The upper index (n) in (10.7) indicates the nth derivative of $L_{k,t}[f(x)]$
with respect to t. The equivalence of this condition with (10.2) is established
by use of the identity


(n) 
This form of the theorem suggests the following result: 
* See, for example, J. Karamata, Neuer Beweis und Verallgemeinerung der Tauberschen Sdize, 
welche die Laplacesche und Stieltjessche Transformation betreffen, Journal fiir die Reine und Ange- 


wandte Mathematik, vol. 164 (1931), p. 29. 
+ III, p. 135. 
THEOREM 10.2. If f(x) has the representation

\[ f(x) = \int_0^\infty \frac{d\alpha(t)}{x+t}, \]

where $\alpha(t)$ is bounded and non-decreasing in the interval $(0, \infty)$, then there exist
non-decreasing functions y:(y), ¥2(y),- such that 


Li,[f(*)] = f e“'dyi(y) 
0 
the integrals converging for x >0. In fact 


(10.8) vi = 
0 


In the operation L;,.,[e~*¥] the function e-*” is regarded as a function of 
x, y being a parameter. The existence of the functions y:(y) follows at once 
from Theorem 3.1 since (10.7) implies that L;,,[ f(x) ] is a completely mono- 
tonic function of ¢. To determine the y;(y) explicitly we show first that 


0 0 


The first of these integrals is equal to 


1/k\tH 
—uy p—ky/tayk 
=( ) e kultykdy, 


Make the change of variable wy =/z, and it becomes 


1/k\*! 
f = f Jdy. 
k! Uu 0 0 


Now returning to the computation of y:(y) we have 


= da(u). 


In view of 


1 
= f y, 
x+u 0 


this becomes 


=f dau) fe 
Interchanging the order of integration, we have
0 0 


so that the uniqueness theorem for Laplace integrals yields (10.8). 

The integral (10.1) is of use in the theory of continued fractions. It is of
particular interest there when the function $\alpha(t)$ is non-decreasing and of such
a nature that the integrals

\[ \int_0^\infty t^n\, d\alpha(t) \]

all converge. In this connection we establish

THEOREM 10.3. A necessary and sufficient condition that f(x) should have
the representation

\[ f(x) = \int_0^\infty \frac{d\alpha(t)}{x+t}, \]

where $\alpha(t)$ is non-decreasing and of such a nature that the integrals

(10.9)  \[ \int_0^\infty t^n\, d\alpha(t) \]

all converge, is that
\[ (-1)^k [x^n f(x)]^{(n+k)} \ge 0 \qquad (x > 0;\ n, k = 0, 1, 2, \cdots) \]
and that f(x) should have an asymptotic development

(10.10)  \[ f(x) \sim \frac{A_1}{x} + \frac{A_2}{x^2} + \cdots \qquad (x \to \infty). \]


To prove the necessity of the condition it remains only to show the
necessity of (10.10). To prove this it is sufficient to show that

\[ \lim_{x \to \infty} \int_0^\infty \frac{t^n\, d\alpha(t)}{x+t} = 0, \]

a result which is established under the hypothesis (10.9) in much the same
way as was done for the case n=1 in the proof of Theorem 10.1.

For the sufficiency of the condition we have as in the proof of Theorem 
10.1 


(10. 11) f(x) = f = f (t)dt. 
0 0 
Moreover, $\beta_0'(0+) = A_1$. Hence integration by parts gives
A, 
(10.12) f(x) =—+ f (t)dt. 
x 0 
By (10.10) 
A; 
lim x? [1 = Ae. 
x 


Applying the same arguments to the integral (10.12) as were applied to the 
integral (10.11) we find that 
(0+) = Ae. 


It is now clear how we can prove by induction that 


(0+) = A, 
But 


Bi = f 
0 


v() = (— u)"da(u), 


0 


so that the sufficiency of the condition is established. 

11. Dirichlet series. The methods of the present paper are well adapted
to the discussion of what functions can be represented in Dirichlet series. We
prove

THEOREM 11.1. A necessary and sufficient condition that f(x) can be
represented in a Dirichlet series

(11.1)  \[ f(x) = \sum_{n=0}^{\infty} a_n e^{-\lambda_n x} \]

convergent for x>0 is that for each positive number $\epsilon$ the function $f(x+\epsilon)$
should satisfy Condition E and that

\[ \lim_{k \to \infty} L_{k,t}[f(x)] = 0 \]

in a set of intervals

(11.2)  \[ 0 = \lambda_0 < t < \lambda_1,\quad \lambda_1 < t < \lambda_2,\quad \cdots, \]

the approach being uniform in any closed sub-interval.
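
The vanishing of $L_{k,t}[f(x)]$ between exponents can be seen on a single term of a Dirichlet series. The sketch below is an illustration under an assumption: it takes the operator of §1 (not reproduced in this section) to be $L_{k,t}[f(x)] = \frac{(-1)^k}{k!} f^{(k)}(k/t)\,(k/t)^{k+1}$, so that for $f(x) = e^{-\lambda x}$ one gets $(\lambda k/t)^k (k/t)\, e^{-\lambda k/t}/k!$. The function name and the sample values are hypothetical.

```python
import math

def post_widder_single_term(k, t, lam):
    """L_{k,t}[f] for f(x) = exp(-lam x), assuming
    L_{k,t}[f] = (-1)^k f^(k)(k/t) (k/t)^(k+1) / k!.
    Logarithms and lgamma keep the large powers and factorials from overflowing."""
    x = k / t
    log_value = k * math.log(lam * x) - lam * x + math.log(x) - math.lgamma(k + 1)
    return math.exp(log_value)

lam = 1.0  # sample exponent: the determining function has a single jump at t = lam
for k in (10, 50, 200):
    # away from t = lam the values die out; at t = lam they grow roughly like sqrt(k)
    print(k,
          post_widder_single_term(k, 0.5, lam),
          post_widder_single_term(k, 1.0, lam),
          post_widder_single_term(k, 2.0, lam))
```

The first and third columns tend to zero while the middle column grows, matching a determining function that is constant except for a jump at the exponent.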
For, if f(x) has the representation (11.1) it also has the form

(11.3)  \[ f(x) = \int_0^\infty e^{-xt}\, d\alpha(t), \]

where $\alpha(t)$ is constant in each of the intervals (11.2). Hence $f(x+\epsilon)$ satisfies
Condition E for each positive $\epsilon$ by Theorem 7.2. By use of an earlier result*

\[ \lim_{k \to \infty} L_{k,t}[f(x)] = 0 \]

in each of the intervals (11.2).
To prove the approach uniform in the interval a<i<b, where 


An << 


we have by simple computations 


Liz[f(x)] = (— > 0) 


1/k 


Since a(u) is constant in the interval (An, \n+1), we have for ¢ in (a, b) 


LiL f(x)] = + Is, 
k 


k+2 
) — ¢)[a(u) — a(t) |du. 


(FZ) (6 — u)(| a(u)| + M)du, 


where M is an upper bound of | a(é)| in (a, b). Since e»/*(A,/#) is a decreasing 
function of ¢ in (a, 6), we have 


* III, p. 137, Theorem 13. 
| I, | s f (6 — u)(| a(u) | + M)du.


It is easily seen that the right-hand member of this inequality tends to zero
with 1/k. Since it is independent of t, $I_1$ approaches zero uniformly with 1/k.
By a similar argument, $I_2$ does likewise. Hence the necessity of the condition
is established.

Conversely, if the conditions of the theorem are satisfied, f(x) has the
form (11.3) by Theorem 7.2. It remains to show that $\alpha(t)$ is a step-function.
By the lemma of §5 we have for any two positive numbers $t_1$ and $t_2$

(11.4)  \[ \alpha(t_2) - \alpha(t_1) = \lim_{k \to \infty} \int_{t_1}^{t_2} L_{k,u}[f(x)]\, du. \]

Since $L_{k,t}[f(x)]$ approaches zero uniformly in any closed sub-interval of the
interval $\lambda_n < t < \lambda_{n+1}$, it follows that for any fixed points $t_1$ and $t_2$ of that
interval we may take the limit under the sign of integration in (11.4), so that

\[ \alpha(t_1) = \alpha(t_2). \]

This proves that $\alpha(t)$ is a step-function of the type required to make (11.3)
a Dirichlet series, so that the theorem is proved.
In a similar way we could prove the result stated in 


THEOREM 11.2. A necessary and sufficient condition that f(x) can be
expressed as a Dirichlet series

\[ f(x) = \sum_{n=0}^{\infty} a_n e^{-\lambda_n x} \]

absolutely convergent for $x \ge 0$ is that f(x) should satisfy Condition A and that
\[ \lim_{k \to \infty} L_{k,t}[f(x)] = 0 \]

in a set of intervals
\[ 0 = \lambda_0 < t < \lambda_1,\quad \lambda_1 < t < \lambda_2,\quad \cdots, \]

the approach being uniform in any closed sub-interval.


12. Determining function of bounded variation. We give here the proof
of Theorem 2.1 omitted in §2. A preliminary result we state as

LEMMA 1. As x approaches zero the function
\[ H(x, y) = e^{-xy} - (1 - x)^y \]

approaches zero uniformly in the interval $0 \le y < \infty$.

The function H(x, y) can clearly be expressed as an integral as follows:

\[ H(x, y) = \int_x^{-\log(1-x)} e^{-ty} y\, dt. \]
Since
\[ 0 \le e^{-u} u \le 1 \qquad (0 \le u < \infty), \]
we have

\[ 0 \le H(x, y) \le \int_x^{-\log(1-x)} \frac{dt}{t} = \log\left( \frac{-\log(1-x)}{x} \right). \]

The right-hand side of this inequality is independent of y and approaches
zero with x, so that the lemma is proved.
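
A quick numerical check of Lemma 1 follows. It is a sketch that assumes the reading $H(x, y) = e^{-xy} - (1-x)^y$ and the bound $\log(-\log(1-x)/x)$ obtained from the integral representation; the grid over y and the function names are illustrative choices, not part of the paper.

```python
import math

def H(x, y):
    """H(x, y) = exp(-x y) - (1 - x)^y for 0 < x < 1 and y >= 0."""
    return math.exp(-x * y) - (1.0 - x) ** y

def uniform_bound(x):
    """log(-log(1 - x) / x): the bound on H(x, y) that does not depend on y."""
    return math.log(-math.log1p(-x) / x)

def sup_over_grid(x, y_max=2000.0, steps=20000):
    """Largest value of H(x, y) over an evenly spaced grid of y in [0, y_max]."""
    return max(H(x, i * y_max / steps) for i in range(steps + 1))

for x in (0.5, 0.1, 0.01):
    # the grid supremum stays below the y-independent bound, and both shrink with x
    print(x, sup_over_grid(x), uniform_bound(x))
```

The bound behaves like x/2 for small x, which is the uniform approach to zero claimed in the lemma.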
An immediate consequence is 


LEMMA 2. For each non-negative value of x

\[ \lim_{a \to \infty} \left[ e^{-nx/a} - \left( 1 - \frac{x}{a} \right)^n \right] = 0 \]

uniformly for all non-negative integers n.
We next introduce 


CONDITION A'. A function f(x) satisfies Condition A' in the interval $(0, \infty)$ if
(a) f(x) is of class $C^\infty$ in the interval $0 < x < \infty$,
(b) a constant M exists such that

\[ \sum_{k=0}^{\infty} |f^{(k)}(x)| \frac{x^k}{k!} \le M \qquad (0 < x < \infty). \]


We now prove 


THEOREM 12.1. A function f(x) which satisfies Condition A' in $(0, \infty)$ is
analytic for $0 < x < \infty$, and

\[ f(x) = \sum_{k=0}^{\infty} \frac{f^{(k)}(a)}{k!} (x - a)^k \qquad (|x - a| < a) \]

for every positive number a.

For the Taylor expansion of f(x) about the point x=a>0,
\[ f(x) = \sum_{k=0}^{\infty} \frac{f^{(k)}(a)}{k!} (x - a)^k, \]
clearly converges absolutely as a result of Condition A' in the interval
0<x<2a. That it converges to f(x) one sees by an investigation of the
remainder, noting that Condition A' implies that

\[ |f^{(k)}(x)| \le \frac{M k!}{x^k} \qquad (x > 0). \]
The above series is also seen to converge absolutely for x=0. Hence by Abel's
theorem f(x) is continuous on the right at x=0 if we define f(0) as
\[ f(0) = \sum_{k=0}^{\infty} f^{(k)}(a) \frac{(-a)^k}{k!}. \]
In the remainder of this section we shall take this as the definition of f(0).
We turn next to the proof of
We turn next to the proof of 


THEOREM 12.2. If f(x) satisfies Condition A' in $(0, \infty)$, then for each
non-negative x the series

(12.1)  \[ \varphi_a(x) = \sum_{k=0}^{\infty} (-1)^k f^{(k)}(a) \frac{a^k}{k!} e^{-kx/a} \qquad (a > 0) \]

converges and
(12.2)  \[ \lim_{a \to \infty} \varphi_a(x) = f(x). \]

The series (12.1) is clearly dominated by the convergent series
\[ \sum_{k=0}^{\infty} |f^{(k)}(a)| \frac{a^k}{k!}, \]
so that its convergence is assured. By Theorem 12.1

\[ f(x) = \sum_{k=0}^{\infty} f^{(k)}(a) \frac{(x - a)^k}{k!} \]

for |x-a| < a, a>0. Then by Lemma 2 of the present section we have

\[ \lim_{a \to \infty} [f(x) - \varphi_a(x)] = 0 \]
for each fixed value of x greater than zero. This completes the proof of the
theorem.
By use of this result we are able to prove 


THEOREM 12.3. Condition A' is necessary and sufficient that f(x) can be
expressed in the form

(12.3)  \[ f(x) = \int_0^\infty e^{-xt}\, d\alpha(t) \qquad (x > 0), \]

where $\alpha(t)$ is of bounded variation in the interval $(0, \infty)$.

We first prove the necessity of the condition. Let f(x) have the form (12.3).
Then the integral converges for x=0 and f(x) is analytic* in $0 < x < \infty$, so that
A' (a) is satisfied.

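It remains to prove A' (b). We have at once

\[ \sum_{k=0}^{\infty} |f^{(k)}(x)| \frac{x^k}{k!} \le \sum_{k=0}^{\infty} \int_0^\infty e^{-xt} \frac{(tx)^k}{k!}\, dV(t), \]

where V(x) is the total variation of $\alpha(t)$ in the interval $0 \le t \le x$. Applying
any one of several familiar tests, we see that it is permissible to integrate
the series

\[ e^{-xt} \sum_{k=0}^{\infty} \frac{(tx)^k}{k!} \]

term by term with respect to V(t) and obtain

\[ \sum_{k=0}^{\infty} |f^{(k)}(x)| \frac{x^k}{k!} \le \int_0^\infty dV(t) = M, \]

so that Condition A' (b) is satisfied.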
To prove the sufficiency of the condition we note first that the function
$\varphi_a(x)$ of Theorem 12.2 can be expressed as a Laplace-Stieltjes integral

\[ \varphi_a(x) = \int_0^\infty e^{-xt}\, d\alpha_a(t), \]


a,(0) = 0, 


= f(a) 


where 
4 (12.4) (t> 0). 
*T, p. 702. 
The total variation of this function in $(0, \infty)$ is clearly equal to

\[ \sum_{k=0}^{\infty} |f^{(k)}(a)| \frac{a^k}{k!}, \]
and this by hypothesis is not greater than M. By Theorem 12.2

\[ f(x) = \lim_{a \to \infty} \int_0^\infty e^{-xt}\, d\alpha_a(t). \]
By a theorem of E. Helly* we can pick from the set of functions $\alpha_a(t)$
$(a = 0, 1, 2, \cdots)$, all of variation not greater than M, a sub-set $\alpha_{a_i}(t)$
$(i = 0, 1, 2, \cdots)$, which approaches a limit $\alpha(t)$, also of variation not greater
than M in $(0, \infty)$. Then

\[ f(x) = \lim_{i \to \infty} \int_0^\infty e^{-xt}\, d\alpha_{a_i}(t). \]
By the Helly-Bray Theorem† we may take the limit under the sign of
integration and obtain

\[ f(x) = \int_0^\infty e^{-xt}\, d\alpha(t) \]

as stated in the theorem. The proof of the theorem shows that the smallest
possible value of M in Condition A' is the total variation of $\alpha(t)$ in $(0, \infty)$.

In addition to giving a new proof of Theorem 2.1 the present methods
give a new proof of Theorem 3.1. For, if f(x) is completely monotonic in
0 < x < ∞ we see at once by Taylor’s series that


$\sum_{k=0}^{\infty} |f^{(k)}(a)|\,\dfrac{a^k}{k!} = f(0+)$ (0 < a < ∞),


so that Condition A’ is satisfied. Hence f(x) has the form (12.3). By (12.4) 
a(t) is non-decreasing and 


= f(0). 


The same must therefore be true of the limit function α(t) so that the proof
of Theorem 3.1 is complete.
Finally, we prove 


THEOREM 12.4. Conditions A and A’ are equivalent.


* E. Helly, Über lineare Funktionaloperationen, Wiener Sitzungsberichte, vol. 121 (1921), p. 265.

† See, for example, G. C. Evans, The Logarithmic Potential, Discontinuous Dirichlet and Neumann
Problems, Colloquium Publications of the American Mathematical Society, vol. 6, 1927,
p. 15.

Let us first show that A’ implies A. Assuming that A’ is satisfied we have
from Theorem 12.1 


n=0 


n=0 


for any positive ¢ less than a. The inequality is strengthened if ¢ is replaced 
by 0 on the right-hand side, and the series continues to converge, for its 
value is 


an 


n=0 


>> | f(a) |= — 


This series converges by virtue of A’. Hence we have 


a yk a* 
f — | du SM, 
o par) k! 


f | du < M, 
0 


and A is satisfied. 
We show conversely that A implies A’. Supposing that A is satisfied, we 
see* that 


lim x*f(x) = 0 


and hence that 


f(a) — f(~) = (- (k = 0, 1, 2, eee 


a 


Further we see by differentiation times, or by applying the above formula 
to f‘»(a), that 


* III, p. 139. 


& 
Hence 
| 
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(k = 0,1,2,---), 
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f(a) = (-— f (p < k). 


Consequently 


P (t — a)*-?a? 
| <f | (p 


k k 


a? k 
x < (¢ — | dt +| f()| 


© 
=f at | s+ | 


<M+|s(@)|, 


p=0 
so that the theorem is proved.


FUCHSIAN GROUPS AND ERGODIC THEORY*†


BY 
EBERHARD HOPF 


Introduction. Let Ω be the phase space of a dynamical system. We suppose
that every motion can be continued along the entire time-axis. Thus we
are concerned with a steady flow in Ω. The following concepts are of fundamental
significance for the study of dynamical flows.


(a) There exists a curve of motion everywhere dense on Ω.


The existence of such a motion is known under the name of regional transitivity.
We now suppose that a measure m in the sense of Lebesgue invariant
under the flow exists on Ω. Such a measure is usually defined by an invariant
phase element dm. The following property is stronger than (a).


(b) The curves of motion not everywhere dense on Ω form a point set on Ω
of m-measure zero.


Still stronger and more important than (b) is strict ergodicity. We suppose
m(Ω) to be finite.


(c) Let f(P) be an arbitrary m-summable function on Ω. The time-average
of f(P) along a curve of motion is then, in general, equal to $\int_\Omega f(P)\,dm / m(\Omega)$,
the exceptional curves forming a point set on Ω of m-measure zero.


How these concepts are interrelated is seen most clearly if we state them in
the following way.

(a’) Every open point set on Ω that is invariant under the flow is everywhere
dense on Ω.


(b’) Every open point set on Ω that is invariant under the flow has the
measure m(Ω).


(c’) Every m-measurable point set on Ω that is invariant under the flow
has either the m-measure zero or m(Ω).


The latter property of a flow is called metric transitivity. Its importance rests 


* Presented to the Society, September 13, 1935; received by the editors August 10, 1935. 

† To August Kopff.

‡ G. D. Birkhoff and P. Smith, Structure analysis of surface transformations, Journal de Mathématiques,
(9), vol. 7 (1928), pp. 345-379.

in its equivalence to (c).* The problem whether a given flow is metrically transitive
or not is, in general, as difficult as it is interesting. Beyond simple examples
progress has recently been made in the direction of certain geodesics
problems. Let Σ be a surface (two-dimensional manifold) of class C;. We
denote by p an arbitrary point on Σ and by φ the angle measuring directions
through p. Every geodesic on Σ is supposed to be continuable indefinitely
in both directions. The line elements


(p, φ)


then constitute the phase space associated with Σ. To the uniform motion
along the geodesics on Σ there corresponds a steady flow on Ω. The element
of volume


dm = dσ dφ,


dσ being the element of area, is well known to be invariant under the flow.

The particular surfaces† Σ considered in this paper are those of constant
negative curvature and of finite connectivity. Their geodesics are supposed to
satisfy the above condition of unlimited continuability. Differential geometry
shows that there exists a one-to-many correspondence between Σ and the
interior |z| < 1 of the unit circle such that the elements of length ds and area
dσ go over into the NE-elements in |z| < 1,


$ds = 2(1 - z\bar z)^{-1}\,|dz|, \quad d\sigma = 4(1 - z\bar z)^{-2}\,dx\,dy$


respectively. The geodesics on Σ go over into the arcs of orthogonal circles
within |z| < 1 (NE-straight lines). The covering transformations are known
to form a Fuchsian group G of linear substitutions S transforming |z| < 1
into itself. |z| = 1 is the principal circle of G.‡ A more general notion of the


* See the literature on the ergodic theorem, viz. 

G. D. Birkhoff, Proof of a recurrence theorem for strongly transitive systems, Proceedings of the 
National Academy, vol. 17 (1931), pp. 650-660. 

T. Carleman, Application de la théorie des équations intégrales linéaires aux systèmes d’équations
différentielles non-linéaires, Acta Mathematica, vol. 59 (1932), pp. 63-87.

E. Hopf, On the time average theorem in dynamics, Proceedings of the National Academy, vol. 18
(1932), pp. 93-100.

A. Khintchine, Zu Birkhoff’s Lösung des Ergodenproblems, Mathematische Annalen, vol. 107

Fuchsian group will be considered here. In order to include one-sided surfaces
we shall admit anti-analytic substitutions, i.e., analytic substitutions of z̄.
On identifying all points of |z| < 1 equivalent under any Fuchsian group G
one defines, conversely, an associated surface Σ.

G always possesses a fundamental region R. It can be chosen so as to
form a NE-convex polygon bounded by a finite number of segments of NE-straight
lines and a finite number of arcs of |z| = 1. The images of R under
all S of G cover the whole of |z| < 1 simply.

We are to distinguish between two fundamentally different kinds of
Fuchsian groups G, and, therefore, of surfaces Σ. G and Σ are of the first
kind if the surface area of Σ or, what is the same, the NE-area of the fundamental
region R is finite. In the opposite case, σ(Σ) = ∞, we speak of the
groups and surfaces of the second kind.* If G is of the first kind, R has no
arcs of |z| = 1 on its boundary. Its vertices lie partly in |z| < 1, partly on
|z| = 1, the angle being zero in the latter case. A well known example is offered
by the case where R is bounded by a regular NE-polygon with the sum of the
angles equal to 2π. If the 4p sides (p > 1) be paired in a certain way, Σ represents
a closed two-sided surface of genus p. Another well known example is
furnished by the modular group where R is bounded by a NE-triangle with
one vertex on |z| = 1. The surface Σ has, in this case, a cuspidal singularity.
For every group G of the second kind, however, R has at least one arc of
|z| = 1 on its boundary and Σ possesses, accordingly, at least one funnel.

For surfaces Σ of the first kind, the regional transitivity (a) has been
proved,† in various degrees of generality, by Artin, J. Nielsen, Koebe and
Löbell, whereas Myrberg discovered the property (b). It is only recently
that Hedlund‡ succeeded in proving the deeper property of metric transitivity
of the two examples mentioned above. It is the purpose of the present
paper to develop an entirely novel and simple method that yields a proof of
the metric transitivity for all surfaces Σ of the first kind.


* This definition is readily found to be in agreement with the one usually given.

† E. Artin, Ein mechanisches System mit quasiergodischen Bahnen, Abhandlungen des Mathematischen
Seminars, Hamburg, vol. 3 (1924), pp. 170-175.

J. Nielsen, Untersuchungen zur Topologie der geschlossenen zweiseitigen Flächen, Acta Mathematica,
vol. 50 (1927), pp. 189-358.

P. Koebe, loc. cit., IV (1929), p. 414.

F. Löbell, Über die geodätischen Linien der Clifford-Kleinschen Flächen, Mathematische Zeitschrift,
vol. 30 (1929), pp. 572-607.

P. J. Myrberg, Ein Approximationssatz für die fuchsschen Gruppen, Acta Mathematica, vol. 57
(1931), pp. 389-409.

‡ G. Hedlund, Metric transitivity of the geodesics on closed surfaces of constant negative curvature,
Annals of Mathematics, (2), vol. 35 (1934), p. 787; A metrically transitive group defined by the modular
group, American Journal of Mathematics, vol. 57 (1935), pp. 668-678.

THEOREM. For all surfaces Σ of the first kind, the flow associated with the
geodesics problem on Σ is metrically transitive.


On surfaces Σ of the second kind (treated in §6), the geodesics show an
entirely different behavior. The corresponding theorem can be stated without
reference to the phase space Ω.


The geodesics through an arbitrarily given point p of Σ disappear, for almost
all directions through p, into the funnels of Σ.


By this we mean that the corresponding NE-straight lines end on one 
of the arcs of |z| =1 belonging to the boundary of R, or on one of the images 
under G of those arcs. The theorem, therefore, merely states that those arcs 
and their images form a set on |z| =1 of the same measure as the unit circle 
itself, which statement, in contrast to the theorem concerning surfaces of 
the first kind, is most readily proved. 

The essential tools used in this paper are potential theory and the NE- 
metric in |z| <1. 

1. Preliminaries on Fuchsian groups. The cross ratio


$[z_1, z_2, z_3, z_4] = \dfrac{z_3 - z_1}{z_3 - z_2}\cdot\dfrac{z_4 - z_2}{z_4 - z_1}$


is unchanged by an analytic linear substitution, whereas it goes over into its
conjugate value under an anti-analytic one. The same holds for the differentials


(1) $\dfrac{dz_2}{z_2 - z_1}\cdot\dfrac{z_3 - z_1}{z_3 - z_2} = [z_1, z_2, z_3, z_2 + dz_2]$


and
(2) $(z_1 - z_2)^{-2}\,dz_1\,dz_2 = -[z_1, z_2, z_1 + dz_1, z_2 + dz_2].$


Only those substitutions S will be considered in the sequel which leave |z| < 1
invariant. The relation


$S(1/\bar z) = 1/\overline{S(z)}$
yields the invariant


(3) $\left|\dfrac{z - w}{1 - \bar z w}\right|^2 = [z, 1/\bar z, w, 1/\bar w],$


and, according to (1), the differential invariant


(4) $\dfrac{dz}{1 - z\bar z}\cdot\dfrac{1 - \bar z w}{w - z} = [1/\bar z, z, w, z + dz].$

From (3) and (4) we obtain Poincaré’s invariant NE-element of length
(5) $ds = 2(1 - z\bar z)^{-1}\,|dz|.$

The corresponding NE-element of area is then

(6) $d\sigma = 4(1 - z\bar z)^{-2}\,dx\,dy.$
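
The invariance of the NE-element of length can be checked numerically. The sketch below assumes the standard parametrization S(z) = e^{iθ}(z − a)/(1 − āz), |a| < 1, of the analytic substitutions preserving |z| < 1; this parametrization, the function names and the sample points are not taken from the paper. It verifies that the density 2/(1 − z z̄) of (5) transforms with the factor |S′(z)|, and that the NE-distance built from the invariant (3) is unchanged by S.

```python
import cmath
import math

def mobius(z, a, theta):
    """S(z) = e^{i theta} (z - a) / (1 - conj(a) z), an automorphism of |z| < 1."""
    return cmath.exp(1j * theta) * (z - a) / (1 - a.conjugate() * z)

def mobius_derivative(z, a, theta):
    """S'(z) = e^{i theta} (1 - |a|^2) / (1 - conj(a) z)^2."""
    return cmath.exp(1j * theta) * (1 - abs(a) ** 2) / (1 - a.conjugate() * z) ** 2

def ne_density(z):
    """Density of the NE-element of length ds = 2 |dz| / (1 - z zbar)."""
    return 2.0 / (1.0 - abs(z) ** 2)

def ne_distance(z, w):
    """NE-distance for this metric: log((1 + t)/(1 - t)), t = |z - w| / |1 - conj(z) w|."""
    t = abs(z - w) / abs(1 - z.conjugate() * w)
    return math.log((1 + t) / (1 - t))
```

With z = 0.3 + 0.4i and a = 0.5 − 0.2i, for instance, `ne_density(mobius(z, a, 0.7)) * abs(mobius_derivative(z, a, 0.7))` agrees with `ne_density(z)` up to rounding.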


The absolute value of (2) with z₁ = ζ, z₂ = z, after being divided by (5), finally
furnishes the invariant Poisson differential
(7) $\dfrac{1 - z\bar z}{|\zeta - z|^2}\,|d\zeta|.$

A group G of substitutions S preserving |z| < 1 is called Fuchsian if it is
infinite and discontinuous, i.e., if it contains no infinitesimal S. From now on,
we suppose G to be Fuchsian. A point z, |z| < 1, and all its equivalent points
S(z) determine the same point of the surface Σ. Associated with G is a group
Γ of contact transformations T in the space of the line elements


(z, φ), φ = arg (dz),
where T has the form


(8) (z, φ) → (S(z), φ + arg S′(z))


in the case of an analytic S and a similar form when S is anti-analytic. Equivalent
line elements define the same point P in the phase space Ω. As S preserves
angles, the transformations T leave the volume element


(9) dm = dσ dφ
in the space of line elements invariant.
We now introduce the new coordinates
(η₁, η₂, s); |η₁| = |η₂| = 1, η₁ ≠ η₂, −∞ < s < ∞,


in the space of line elements. η₁ and η₂ are initial and end point, respectively,
of the sensed NE-straight line passing through (z, φ). Let s denote the NE-distance
of z from the point z₀ bisecting the circular arc (η₁, η₂), the sign of s
being plus (minus) if z is met after (before) z₀ on (η₁, η₂). The correspondence
between (z, φ) and (η₁, η₂, s) is evidently one-to-one. In the new coordinates,
the transformations T are easily seen to be of the form


(10) (η₁, η₂, s) → (S(η₁), S(η₂), s + f_S(η₁, η₂)).


We now prove that the volume element (9) becomes


(11) $dm = k\,\dfrac{|d\eta_1|\,|d\eta_2|}{|\eta_1 - \eta_2|^2}\,ds,$
k being a positive constant. Indeed we can always find a linear substitution
preserving |z| = 1 which transforms an arbitrarily given line element


(z, φ) = (η₁, η₂, s)
into any other one,
(z′, φ′) = (η₁′, η₂′, s′).
The associated contact transformation is of the form (8) and therefore leaves 
(9) invariant. Being of the form (10), it also leaves invariant the right-hand 
side of (11) in view of the invariance of (2) and (5). Hence, the two sides 
can differ by a constant factor only. 


In the new coordinates the flow associated with the geodesics on Σ is described
by the simple formulas


(12) P = (η₁, η₂, s) → (η₁, η₂, s + t).


The invariance of dm under the flow is now a trivial consequence of (11)
and (12).

The explicit connection between the coordinates is readily established,


$s = \log\,[\eta_1, \eta_2, z, z_0], \quad z_0 = \dfrac{\eta_1 + \eta_2}{2 + |\eta_1 - \eta_2|},$


$\varphi = \arg\dfrac{(z - \eta_1)(z - \eta_2)}{\eta_1 - \eta_2},$


but it is not needed for our purposes.
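
Although the explicit formulas are not used later, they are easy to test numerically. The following sketch takes the end points η₁ = e^{iθ}, η₂ = e^{−iθ}, for which the NE-straight line is the arc of the circle with center 1/cos θ and radius tan θ, and checks that z₀ is the bisecting point and that s is the signed NE-distance from z₀. The helper names and the sample values are assumptions of this illustration, and the cross ratio is taken in the order [z₁, z₂, z₃, z₄] = (z₃ − z₁)(z₄ − z₂)/((z₃ − z₂)(z₄ − z₁)).

```python
import cmath
import math

def cross_ratio(z1, z2, z3, z4):
    """[z1, z2, z3, z4] = (z3 - z1)/(z3 - z2) * (z4 - z2)/(z4 - z1)."""
    return (z3 - z1) / (z3 - z2) * (z4 - z2) / (z4 - z1)

def midpoint(eta1, eta2):
    """z0 = (eta1 + eta2) / (2 + |eta1 - eta2|), the bisecting point of the NE-line."""
    return (eta1 + eta2) / (2.0 + abs(eta1 - eta2))

def signed_ne_position(eta1, eta2, z):
    """s = log [eta1, eta2, z, z0]; the cross ratio is real and positive on the NE-line."""
    cr = cross_ratio(eta1, eta2, z, midpoint(eta1, eta2))
    return math.log(cr.real)

def ne_distance(z, w):
    """NE-distance for ds = 2 |dz| / (1 - z zbar)."""
    t = abs(z - w) / abs(1 - z.conjugate() * w)
    return math.log((1 + t) / (1 - t))
```

A point z on the arc lying between z₀ and η₂ should give s > 0 equal to `ne_distance(z, z0)`, in agreement with the sign convention stated above.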

2. Another formulation of the theorem. A Fuchsian group of the first kind
possesses a fundamental region R of finite NE-area. To the subdivision of
|z| < 1 into the NE-congruent parts S(R) there corresponds a subdivision of
the (η₁, η₂, s) space into cells congruent to each other under the transformations
T of Γ. Each of these cells is a representative of Ω, with


m(Ω) = 2πσ(R) < ∞.
For the proof of the announced theorem it is sufficient to prove


THEOREM A. A point set A on the (η₁, η₂) torus which is measurable in the
sense of ordinary Lebesgue measure, for which


$\iint_A |d\eta_1|\,|d\eta_2| > 0$


and which is invariant under the simultaneous substitutions S(η₁), S(η₂) of G,
has the measure of the entire torus.

Suppose Theorem A to be proved. We start with a point set on Ω satisfying
the hypothesis of the theorem announced in the introduction. According
to (12), this set represents, in the (η₁, η₂, s) space, a cylindrical set, i.e., a set
on the (η₁, η₂) torus. According to (11), the part of this set within each of the
above cells is of positive measure


$k\iiint \dfrac{|d\eta_1|\,|d\eta_2|}{|\eta_1 - \eta_2|^2}\,ds$
and, therefore, of positive torus measure. The sum of these parts obviously
represents a set A in the sense of Theorem A. Hence its complement has the
torus measure zero. Regarded as a cylindrical set in the (η₁, η₂, s) space the
latter set is, necessarily, of m-measure zero in that space and, therefore, in Ω.
From now on we may without loss of generality confine ourselves to two-sided
surfaces Σ, i.e., to the case where G contains only analytic substitutions
S. For, let S denote the analytic and S̄ the anti-analytic substitutions of G.


The S’s form a subgroup g of G and each S̄ can be written in the form S̄ = S S̄₀
where S̄₀ is a fixed S̄. As


R + S̄₀(R)


is evidently a fundamental region for g, this group is seen to be again a
Fuchsian group of the first kind. If Theorem A holds for g, it holds also for G.

Let now U(η₁, η₂) be the function on the torus which equals zero on the set
A of Theorem A and one elsewhere. U is measurable and invariant under G,


(13) U(S(η₁), S(η₂)) = U(η₁, η₂).


It is to be proved that U = 0 except on a torus set of measure zero. We transform
our problem once more by introducing harmonic functions. The Poisson
integral


(14) $U(z, y) = \dfrac{1}{2\pi}\int_{|\zeta|=1} U(\zeta, y)\,\dfrac{1 - z\bar z}{|\zeta - z|^2}\,|d\zeta|$
represents, for almost all y on |y| = 1, a harmonic function of z in |z| < 1 and,
for every such z, a bounded and measurable function of y on |y| = 1. Furthermore,
the function


(14′) $U(z, w) = \dfrac{1}{2\pi}\int_{|y|=1} U(z, y)\,\dfrac{1 - w\bar w}{|y - w|^2}\,|dy|$


is, for |z| < 1 and |w| < 1, harmonic in z as well as in w. (14′), combined with
(14), could, of course, be written as one double Poisson integral. In view of
the invariance of the Poisson differentials, the invariance (13) under G of
the “torus values” of U(z, w) implies the invariance of the function itself,


(15) U(S(z), S(w)) = U(z, w),


for all S of G.

Formulas (14) and (14′) show that U(z, w) ≡ 0 implies the vanishing of
the torus values U(ζ, y) up to a torus set of measure zero. To prove Theorem
A it therefore suffices to prove

THEOREM B. Suppose that U(z, w) ≥ 0 is bounded and harmonic in z as
well as in w, |z| < 1, |w| < 1, and that U satisfies (15) for all S of G. If the
torus values of U vanish on a set of positive torus measure then U vanishes identically.

We have to specify in what sense the torus values U(ζ, y) may be regarded
as limit values of U(z, w). If u(z) is a bounded harmonic function of
a single point z, |z| < 1, we have


(16) $\operatorname*{l.i.m.}_{r\to 1}\ u(r\zeta) = u(\zeta)$


on |ζ| = 1.* An analogue for harmonic functions of two points is quite similarly
proved,


(16′) $\operatorname*{l.i.m.}_{r\to 1,\ \rho\to 1}\ U(r\zeta, \rho y) = U(\zeta, y)$


on the torus |ζ| = |y| = 1.

3. Auxiliary theorems. In the sequel we denote by K_l the interior of the
circle about z = 0 with the NE-radius l. A simple computation of the NE-area
yields the formula


(17) $\sigma(K_l) = \pi(e^{l} + e^{-l} - 2).$
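
The "simple computation" can be reproduced as follows. With the element of area (6), a circle of NE-radius l about the origin has Euclidean radius tanh(l/2), so that σ(K_l) is the integral of 4 · 2πρ/(1 − ρ²)² from 0 to tanh(l/2), which equals 4π sinh²(l/2) = π(e^l + e^{−l} − 2). The sketch below compares a midpoint-rule value of this integral with (17); the function names and the step count are choices made for this illustration only.

```python
import math

def ne_area_closed_form(l):
    """Formula (17): sigma(K_l) = pi (e^l + e^{-l} - 2)."""
    return math.pi * (math.exp(l) + math.exp(-l) - 2.0)

def ne_area_numeric(l, steps=20000):
    """Midpoint rule for the integral of 8 pi rho / (1 - rho^2)^2 over 0 <= rho <= tanh(l/2)."""
    radius = math.tanh(l / 2.0)  # Euclidean radius of the NE-circle of radius l
    h = radius / steps
    total = 0.0
    for i in range(steps):
        rho = (i + 0.5) * h
        total += 8.0 * math.pi * rho / (1.0 - rho * rho) ** 2
    return total * h
```

The exponential growth of σ(K_l) visible here is what makes large values of l dominate the averages over K_l used in the proofs below.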
For the validity of Lemma 1 we assume that z = 0 is interior to some R.†


LEMMA 1. If a set B in |z| < 1 is invariant under all S of G, and if R is a
fundamental region for G, then, for l sufficiently large,


$\dfrac{\sigma(BK_l)}{\sigma(K_l)} < a\,\sigma(BR),$
a being a positive constant depending only on G.


This lemma is not quite trivial in the case where R has vertices on |z| = 1
and is used mainly to take care of the slight complications arising from this
case in the proof of Theorem B. The proofs of this and of the following Lemmas
2 and 3 are given in §§4 and 5.

* l.i.m. means limit in the mean of order two.

† If it is not, by a suitable linear substitution, we can always move a given interior point of R
into the origin. It may well be mentioned that the origin plays here a mere auxiliary role.


LEMMA 2. Let u(z) ≥ 0 be bounded and harmonic in |z| < 1, and let
u(S(z)) = u(z)
hold for all S of G. If the boundary values of u on |z| = 1 vanish on a set of positive
measure, u(z) vanishes identically.


This is a very simple special case of Theorem B. The principal difficulty
in the proof of that theorem is surmounted by the main


LEMMA 3. If U(z, w) satisfies all the hypotheses of Theorem B, the measure
on |y| = 1 of the set where U(0, y) = 0 is necessarily positive.


Proof of Theorem B. The set E on |y| = 1 where U(0, y) = 0 is of positive
measure according to Lemma 3. We now make use of Harnack’s inequalities
for a non-negative harmonic function u(z), |z| < 1,


$e^{-s(z, z')}\,u(z') \le u(z) \le e^{s(z, z')}\,u(z'),$


where s(z, z′) denotes the NE-distance of the two points. These inequalities
being applied to U(z, w) ≥ 0 yield
(18) $e^{-s(0, z)}\,U(0, w) \le U(z, w) \le e^{s(0, z)}\,U(0, w),$
which shows that, for a fixed z, the set where the boundary values U(z, y)
of U(z, w) on |w| = 1 vanish is independent of z; in fact it coincides with E
except for a set of measure zero. Now according to (15) we have, for an arbitrary
S of G,


$U(S(0), w) = U(0, S^{-1}(w)).$


On replacing w by S⁻¹(w) and y by S⁻¹(y) in (14′) and on taking into account
the invariance of the Poisson differential, we infer from (14′) that


$U(0, S^{-1}(w)) = \dfrac{1}{2\pi}\int_{|y|=1} U(0, S^{-1}(y))\,\dfrac{1 - w\bar w}{|y - w|^2}\,|dy|.$


Hence U(0, S⁻¹(y)) are the boundary values of U(0, S⁻¹(w)). Since S(E) is
the set where these boundary values vanish, the equation S(E) = E holds for
all S of G apart from a null set on the unit circle. Considering, in the same
way as before, the harmonic function u(z), |z| < 1, whose boundary values
are zero on E and one elsewhere, we infer that u satisfies the hypothesis of
Lemma 2 and, therefore, that u ≡ 0, i.e., that E has the measure of the entire
unit circle. It then follows from the definition of E that the boundary values
of U(0, w) vanish almost everywhere. Hence U(0, w) ≡ 0 and, according to
(18), U(z, w) ≡ 0, which is the desired result.
4. Proof of Lemmas 1 and 2. We denote by N(z, l) the number of points
S(z) congruent to z which lie in K_l. We first show that, for l sufficiently large,

(19) $\dfrac{N(z, l)}{\sigma(K_l)} < a,$
where a > 0 depends on G only. N(z, l) is the number of points S(z) whose
NE-distance from the origin does not exceed l,


s(0, S(z)) ≤ l.
Since
s(0, S(z)) = s(S⁻¹(0), z),


N(z, l) is not greater than the number of points S⁻¹(0) congruent to the origin
with a NE-distance ≤ l from z, provided that the points S⁻¹(0) are different
for different substitutions S. This is the case, as the origin is interior to R
and as an interior point of R cannot be a fixed point for any S except for the
identity transformation. We furthermore know that


s(S(0), 0) > b, S(0) ≠ 0,


holds, where b > 0 depends on G only. Therefore a circle of NE-radius b about
any point congruent to the origin contains no other such point. This implies
that the number of the different points S(0) with a NE-distance ≤ l from z
is less than the number of mutually exclusive circles of NE-radius b which
can be placed within a circle of NE-radius l + b. Hence


$N(z, l) < \dfrac{\sigma(K_{l+b})}{\sigma(K_l)\,\sigma(K_b)}\,\sigma(K_l).$


Here the first factor tends to $\pi^{-1}(1 + e^{-2b} - 2e^{-b})^{-1}$ as l → ∞, which proves (19).
We now return to the set B of Lemma 1. By means of the function


φ(z) = 1 in K_l, φ(z) = 0 elsewhere,


we obtain


$N(z, l) = \sum_S \varphi(S(z)),$


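and therefore


(20) $\iint_{BR} N(z, l)\,d\sigma = \sum_S \iint_{BR} \varphi(S(z))\,d\sigma = \sum_S \iint_{S(BR)} \varphi(z)\,d\sigma.$
Since B = S(B) for all S of G,
$\sum_S S(BR) = \sum_S B\,S(R) = B\sum_S S(R) = B,$


and the right hand side of (20) equals


(20′) $\iint_B \varphi(z)\,d\sigma = \sigma(BK_l).$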


Lemma 1 obviously follows from (19), (20) and (20’). 

Proof of Lemma 2. In the particular case where the boundary of R lies
within |z| < 1 the lemma is obvious, since then u(z) attains its extrema at
some points of |z| < 1, i.e., at interior points of |z| < 1. If R has vertices on
|z| = 1 an elementary proof could still be given. We prefer, however, to use
tools which seemed unavoidable in the further course of the proof of Theorem
B. The auxiliary function


(21) $h_\epsilon(t) = (1 - t/\epsilon)^3$ for 0 ≤ t ≤ ε, $h_\epsilon(t) = 0$ for t > ε,


is concave and possesses a continuous second derivative, t ≥ 0. We first show
that


(22) $\lim_{l\to\infty} \dfrac{1}{\sigma(K_l)}\iint_{K_l} h_\epsilon(u(z))\,d\sigma = \dfrac{1}{2\pi}\int_{|\zeta|=1} h_\epsilon(u(\zeta))\,|d\zeta|$


and that


(23) $\lim_{\epsilon\to 0}\int_{|\zeta|=1} h_\epsilon(u(\zeta))\,|d\zeta| = \operatorname{meas}\,[u(\zeta) = 0].$


The integral average on the left of (22) is, evidently, an average of


(24) $\dfrac{1}{2\pi}\int_{|\zeta|=1} h_\epsilon(u(r\zeta))\,|d\zeta|$

over a certain range of r corresponding to the range from 0 to l for the NE-radius.
In this average large values of l, i.e., values of r near one, are of dominating
weight. Since, according to (16), (24) tends to the right-hand side of
(22) as r → 1 it follows that (22) must also be true. Finally, (23) follows from
the obvious inequalities


$\operatorname{meas}\,[u(\zeta) = 0] \le \int_{|\zeta|=1} h_\epsilon(u(\zeta))\,|d\zeta| \le \operatorname{meas}\,[u(\zeta) < \epsilon].$

For later purposes, we need the analogous relations resulting from (16′). On
setting


(25) $M_\epsilon(l) = \dfrac{1}{\sigma^2(K_l)}\iint_{K_l}\iint_{K_l} h_\epsilon(U(z, w))\,d\sigma_z\,d\sigma_w$


we obtain quite similarly
(26) $\lim_{\epsilon\to 0}\,\lim_{l\to\infty} M_\epsilon(l) = (4\pi^2)^{-1}\operatorname{meas}\,[U(\zeta, y) = 0].$


Returning to the proof of Lemma 2 we note that the integral average on
the left in (22) is less than
(27) $\dfrac{\sigma(B_\epsilon K_l)}{\sigma(K_l)},$ l sufficiently large,

where B_ε is the set of all z in |z| < 1 satisfying u(z) < ε. The invariance under
G of the function u(z) implies that of the point set B_ε. By Lemma 1, (27)
and therefore the left-hand average in (22) does not exceed the value


(28) a σ(B_ε R).


From the hypothesis of the present lemma, viz. that the right-hand side of
(23) is positive, it then follows that (28) remains, for all ε > 0, above a positive
constant. In particular, the set common to all B_ε is not empty. There
exists therefore a point in |z| < 1 where u = 0, i.e., where u ≥ 0 attains its
minimum, which completes the proof of the lemma.

5. Proof of Lemma 3. We first confine ourselves to the simpler case where
the boundary of the fundamental region R of G lies entirely in |z| < 1. Of
all the images S(R) we call R₀ the particular one that contains the origin in
its interior or on its boundary,


(29) 0 ∈ R₀.


R₀ and all its images have the same finite NE-diameter D. We enumerate all
these congruent parts of |z| < 1 in an arbitrary way, R₀, R₁, R₂, ···, and
we call S_ν the substitution of G that transforms R_ν into R₀,


(30) S_ν(R_ν) = R₀.


We now consider all R_ν lying entirely in the closed circular disc K_l. They evidently
cover the whole of K_{l−D}. From (25) we then obtain


where 


(31′) $q(l) = [\sigma(K_l)/\sigma(K_{l-D})]^2$
and where integration and summation are carried out so as to satisfy the conditions
(31′′) z ∈ R_ν ⊂ K_l, w ∈ R_μ ⊂ K_l
in all possible ways. By (29), (30), and (31′′),

s(0, S_ν(z)) ≤ D,
whence, by Harnack’s inequality,

$U(z, w) = U(S_\nu(z), S_\nu(w)) \ge e^{-D}\,U(0, S_\nu(w)),$
and, as h_ε(t) nowhere increases,
(32) $h_\epsilon(U(z, w)) \le h_\epsilon\{e^{-D}\,U(0, S_\nu(w))\}.$
Now, the function of w

(33) $h_\epsilon\{e^{-D}\,U(0, w)\}$


is a concave function of a harmonic function and, therefore, subharmonic in
|w| < 1. This is most easily proved by verifying the non-negativeness of the
Laplacian.* Hence, (33) nowhere exceeds the function V(w) = V_ε(w) which
is harmonic in |w| < 1 and which possesses, on |w| = 1, the same boundary
values. The slight difficulty brought about by the fact that these boundary
values are merely measurable is readily surmounted by considering first
smaller circles and by proceeding then to the limit as r → 1. We note that


(34) $V(0) = \dfrac{1}{2\pi}\int_{|y|=1} h_\epsilon\{e^{-D}\,U(0, y)\}\,|dy|,$


and that, by (32),
(35) $h_\epsilon(U(z, w)) \le V(S_\nu(w)).$
From (31), (31′′), and (35) we obtain, with regard to V_ε ≥ 0,


S SS. V(S, (1))doe} dor, 


Since V(S,(w)) is harmonic in |w| <1, we have by Gauss’s mean-value theo- 
rem, 


* This is where the existence and continuity of h_ε′′ is used.


o(K1) 
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whence 
M.(l — D) o(R,)V(S,(0)) 
€ =>q o(K,) y 


(36) 


the summation being extended over all ν for which R_ν ⊂ K_l. On setting

(37) S_ν(R₀) = R_ν′,

we infer from (29) that

(38) S_ν(0) ∈ R_ν′.

For two different R_ν the corresponding R_ν′ are obviously different. Furthermore,
it follows from (30) and (37) that the NE-distance of R_ν′ from R₀ is
the same as that of R_ν from R₀. According to (31′′) the NE-distance of R_ν
from R₀ is less than l. Hence all regions R_ν′ considered here must lie within the
circle K_{l+2D}. Thus (36) can be written as


(38) D) ff 
o(Ki) > R, 
where 
(38’) S,(0)¢ ¢ 
By (38′) on applying Harnack’s inequality to V ≥ 0 we have
$V(S_\nu(0)) \le e^{D}\,V(z),$ z ∈ R_ν′,
which being combined with (38) yields


o(Ki) 
ePq(l) o(Ki,2p) 
V(z)do, = g(l)e? V(0), 
and, according to (17), 
lim M.(1) < e&?V(0). 


lew 
On taking account of (26) and (34) we infer from this inequality that 


(39) meas [U(¢, y) = 0] S meas [U(O, y) = 0], 
ly|=1 


which proves Lemma 3 in the case where R has no vertices on |z| =1. 


In the general case by cutting off the vertices lying on |z| = 1 we can always
divide R₀ into two parts,

(40) R₀ = R₀* + R₀**

such that

(41) σ(R₀**) < δ


and that R₀* has a finite NE-diameter, say D = D(δ). We may always suppose
R₀* to contain the origin. The set


$B = \sum_S S(R_0^{**})$


is invariant under G. The set R₀* and all sets R_ν* congruent to it cover the
complement of B in |z| < 1. We note that all R_ν* have the same NE-diameter
D. Consider all


Every point of K_{l−D} belongs either to one of these R_ν* or to B. For, if it belongs
to any R_ν* at all, this set must be contained in K_l since its diameter
equals D. Hence


$K_{l-D} \subset \sum_\nu R_\nu^{*} + BK_l.$


Since h_ε ≤ 1 we obtain from (25)


the summation being confined to all 
Ki, Ric Ki. 


Here the second term satisfies the same inequalities as before. The first term 
is less than 


a’o?(BRo) = a®a?(Ri*) < 


by Lemma 1 and (41). On proceeding as before to the limit as l → ∞, ε → 0,
we obtain inequality (39) with the additional term a²δ in the right-hand side.
A suitable choice of δ evidently leads to the general proof of Lemma 3. Theorem
B, and therefore the proposed theorem on surfaces Σ of the first kind, is
herewith completely proved.

6. Surfaces of the second kind. For a group G of the second kind, the
fundamental region has on its boundary one or several arcs of the unit circle.
We shall consider only the case where R has no zero angle vertices on |z| = 1,
i.e., where the surface Σ has no cusps. These arcs and their images under G
are known to lie everywhere dense on |z| = 1. We shall prove that this set w
has the measure of the whole of |z| = 1. Let a be one of the complete arcs of
|z| = 1 belonging to the boundary of R. There will be two images of R adjacent
to R along the two sides of a which end at the two end points of a,
respectively. In particular, there are two arcs of the set w immediately adjacent
on both sides of a. This shows that the end points of any arc a are
interior points of the set w introduced above. Since w is invariant under G,
the Poisson integral u(z) whose boundary values on |z| = 1 are zero on w and
one elsewhere must also be invariant under G,


u(S(z)) = u(z)


for all S of G. All we have to prove is that u ≡ 0. Indeed, u(z) has, in the sense
of ordinary convergence, the boundary value zero on every closed arc a. On
account of its invariance, u takes all its values in R. Since a harmonic function
always attains its extrema on the boundary, u(z) must attain them on
the (closed) part of |z| = 1 belonging to the boundary of R, whence u ≡ 0.

If R has vertices on |z| = 1 an elementary proof could still be given as
well as for Lemma 2, for instance by applying Green’s formula


$\iint \operatorname{grad}^2 u\,dx\,dy = \int u\,\dfrac{\partial u}{\partial n}\,ds$


to a region obtained by diminishing R suitably (cutting off the vertices on
|z| = 1 in a suitable way).


ON ASYMPTOTIC DISTRIBUTIONS OF ARITHMETICAL
FUNCTIONS*


BY 
I. J. SCHOENBERG 


INTRODUCTION 


1. The present note was suggested by recent work of H. Davenport, [3],†
S. Bochner and B. Jessen, [2], and A. Wintner and B. Jessen, [6]. Davenport
established the existence of asymptotic distribution functions for a certain
class of arithmetical functions by an extension of a method previously used
by the author, [8], [9], in a similar investigation. This method was based
on the consideration of the moments of the distribution functions. In questions
of asymptotic distribution, however, Bochner and Jessen have shown
the great advantage of dealing directly with the Fourier transforms of the
distribution functions. This advantage becomes again apparent if the method
of Fourier transforms, whose adaptation to sequences is fully developed in §I,
is applied to Davenport’s problem. This is precisely what we shall do in §II;
the result thus obtained (Theorem 1) insures the existence of the asymptotic
distribution function for a very large class of (positive and multiplicative)
arithmetical functions. It includes Davenport’s and the author’s previous
results and yields readily (by suitable specializations of the arithmetical function
involved) the frequencies of certain classes of integers investigated by
W. Feller and E. Tornier, [4], in an entirely different way.

The connection with the work of Wintner and Jessen, [6], is as follows. 
The distribution function w(x) =x(e") of Theorem 1 is a special example of 
the infinite convolutions of purely discontinuous distribution functions in- 
vestigated by these authors. They have shown ([6], Theorem 35) that such 
infinite convolutions can be only either purely discontinuous or else every- 
where continuous, and in the latter case either singular functions or else ab- 
solutely continuous functions. These general results apply immediately to 
our special situation, but new and probably difficult problems arise which 
may be mentioned here. Theorem 1 gives simple sufficient conditions to in- 
sure continuity or discontinuity of w(x); the problem of finding similar neces- 
sary and sufficient conditions for continuity remains unsolved. The more deli- 
cate problem of deciding whether a continuous w(x) is singular or absolutely 


* Presented to the Society, March 31, 1934; received by the editors July 25, 1935. 
† Numbers in brackets refer to the Bibliography at the end of this paper.
continuous is likewise unsolved. It can be shown that a continuous w(x) is
necessarily singular if $\log f(p_n) = O(k^{-n})$ ($p_n$ = nth prime, $k > 1$). However, I
do not have any example of an absolutely continuous w(x). 


I. ASYMPTOTIC DISTRIBUTIONS OF REAL SEQUENCES 


2. Let us recall first a few well known definitions. A finite or infinite class
C of increasing positive integers $n_1, n_2, n_3, \cdots$ is said to have a frequency (or
density) F{C} if

$\lim_{n\to\infty} \frac{1}{n} \sum_{n_\nu \le n} 1 = F\{C\}$.

In case this limit does not exist then the upper limit of the same expression
is the upper frequency $\overline{F}\{C\}$ and the lower limit is the lower frequency $\underline{F}\{C\}$
of the class C. A function w(x) defined for $-\infty < x < \infty$, which is monotonic
with $w(-\infty) = 0$, $w(\infty) = 1$, is called a distribution function (d.f.). A real sequence
$x_1, x_2, x_3, \cdots$ is said to have an asymptotic distribution function (abbreviated:
a.d.f.) w(x) if for every point of continuity $x = \xi$ of w(x) we have


(1)    $F\{x_n \le \xi\} = w(\xi)$.


In this relation $F\{x_n \le \xi\}$ means the frequency of the class of integers n for
which $x_n \le \xi$. For example we have so-called equi-distribution* in the interval
(0, 1) if the above definition holds with w(x) = 0 for x < 0, = x for 0 ≤ x ≤ 1,
= 1 for x > 1.
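
The definition can be tried out numerically. The following is a minimal sketch, not part of the original note: it assumes the illustrative sequence $x_\nu$ = fractional part of $\nu\sqrt{2}$ (a standard equi-distributed example) and evaluates the step function that counts the proportion of the first n elements not exceeding x. The function names are hypothetical.

```python
import math

def empirical_df(seq, x):
    """Proportion of the elements of the finite list seq that are <= x."""
    return sum(1 for v in seq if v <= x) / len(seq)

def frac_sqrt2_sequence(n):
    """First n terms of x_nu = fractional part of nu*sqrt(2) (illustrative choice)."""
    root = math.sqrt(2.0)
    return [(nu * root) % 1.0 for nu in range(1, n + 1)]

seq = frac_sqrt2_sequence(20000)
for xi in (0.1, 0.25, 0.5, 0.9):
    # For an equi-distributed sequence the second column should be close to xi.
    print(xi, empirical_df(seq, xi))
```

For this sequence the printed proportions should lie close to the points xi themselves, which is the statement w(x) = x on (0, 1).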

Let $N(x_\nu \le x)_n$ denote the number of those elements among the first n elements
of the sequence $\{x_n\}$ which are $\le x$. With our sequence we may connect
a sequence of distribution functions (step functions)


$w_n(x) = (1/n)\,N(x_\nu \le x)_n$    ($-\infty < x < \infty$; n = 1, 2, 3, ···).


A comparison with our previous definition shows that the d.f. w(x) is the a.d.f.
of our sequence $\{x_n\}$ if and only if the relation


(2)    $\lim_{n\to\infty} w_n(\xi) = w(\xi)$

holds for every continuity point $x = \xi$ of w(x). The limiting relation (2) is usually
described by saying that the sequence of d.f. $w_n(x)$ converges essentially
to the d.f. w(x).

Throughout this note we write


$\frac{g(x_1) + g(x_2) + \cdots + g(x_n)}{n} = M_n\{g(x)\}$,    $\lim_{n\to\infty} M_n\{g(x)\} = M\{g(x)\}$,


provided the last limit exists. An important property of asymptotic distributions
of sequences is contained in the following


* H. Weyl, [10].


Lemma 1. Let the sequence $\{x_n\}$ admit the a.d.f. w(x) and let g(x) be a
bounded continuous complex-valued function for $-\infty < x < \infty$. Then


(3)    $M\{g(x)\} = \int_{-\infty}^{\infty} g(x)\,dw(x)$.


Since


$M_n\{g(x)\} = \int_{-\infty}^{\infty} g(x)\,dw_n(x)$,


the relation (3) is a special case of a theorem of P. Lévy.*
3. The following theorem gives a criterion for asymptotic distributions of 
sequences. 


Lemma 2. Necessary and sufficient conditions that a sequence $\{x_n\}$ shall
have an asymptotic distribution are as follows: The mean value


(4)    $M\{e^{itx}\} = \lim_{n\to\infty} \frac{1}{n}\,(e^{itx_1} + \cdots + e^{itx_n}) = L(t)$


shall exist for every real t and be an everywhere continuous function of t. If these
conditions hold then L(t) is of the form


(5)    $L(t) = \int_{-\infty}^{\infty} e^{itx}\,dw(x)$,


where w(x) is the a.d.f. of our sequence $\{x_n\}$.
* P. Lévy, [7], pp. 195-196. Lévy’s theorem is as follows: If a sequence of d.f. $w_n(x)$ converges
essentially to a d.f. w(x), then

$\lim_{n\to\infty} \int_{-\infty}^{\infty} g(x)\,dw_n(x) = \int_{-\infty}^{\infty} g(x)\,dw(x)$

for every bounded continuous g(x).


The necessity of these conditions is a consequence of Lemma 1. In view
of (4) we have


$\sum_{\mu,\nu=1}^{m} L(t_\mu - t_\nu)\,\rho_\mu \bar{\rho}_\nu = M\Big\{ \Big| \sum_{\mu=1}^{m} e^{i t_\mu x} \rho_\mu \Big|^2 \Big\} \ge 0$,


hence L(t) is a positive-definite function which, being assumed continuous,
is of the form (5) with a non-decreasing w(x) uniquely defined by $w(-\infty) = 0$.*
For t = 0 we get $w(+\infty) = L(0) = 1$, hence w(x) is a d.f. From (4) we infer


(6)    $\lim_{n\to\infty} \int_{-\infty}^{\infty} e^{itx}\,dw_n(x) = \int_{-\infty}^{\infty} e^{itx}\,dw(x)$    for all real t,

and a theorem of P. Lévy† insures the relation (2), i.e., w(x) is the a.d.f. of
our sequence.


II. ASYMPTOTIC DISTRIBUTIONS OF MULTIPLICATIVE 
ARITHMETICAL FUNCTIONS 


4. Let f(n) be a multiplicative arithmetical function, that is, a function
defined for n = 1, 2, 3, ··· and satisfying the relations


(7)    f(mn) = f(m)f(n) if (m, n) = 1,    f(1) = 1.


As an immediate consequence of the unique factorization of integers into
powers of primes, a multiplicative function is completely defined by prescribing
arbitrarily the values of $f(p^\alpha)$ for all primes p and integers $\alpha \ge 1$. In describing
such functions we therefore need to consider only the $f(p^\alpha)$.
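
As an illustration of this remark, here is a minimal sketch (not from the original note) that builds a multiplicative function from its values on prime powers. The helper names `factorize` and `multiplicative` are hypothetical; the two sample functions are φ(n)/n and n/σ(n), which are discussed later in the note.

```python
def factorize(n):
    """Canonical decomposition of n as a dict {prime: exponent}."""
    factors = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors

def multiplicative(prime_power_value):
    """Return f with f(n) = product of prime_power_value(p, a) over p^a exactly dividing n."""
    def f(n):
        value = 1.0  # empty product, so f(1) = 1
        for p, a in factorize(n).items():
            value *= prime_power_value(p, a)
        return value
    return f

# phi(n)/n has the value 1 - 1/p on every power of p.
phi_over_n = multiplicative(lambda p, a: 1.0 - 1.0 / p)
# n/sigma(n): sigma(p^a) = (p^(a+1) - 1)/(p - 1).
n_over_sigma = multiplicative(lambda p, a: p**a * (p - 1) / (p**(a + 1) - 1))

print(phi_over_n(12), n_over_sigma(6))
```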

Our problem is as follows: Under what assumptions does the sequence
$u_n = f(n)$ have an asymptotic distribution function, and how is this function
connected with the $f(p^\alpha)$? The results of this note in this direction are contained
in the following


THEOREM 1. Let a multiplicative arithmetical function f(n) satisfy the condi- 
tions 


(i) $f(p^\alpha) > 0$,


(ii) the series 


* This follows from an important theorem of Bochner: A continuous positive-definite function 
is of the form (5) with a non-decreasing w(x). The converse is obviously true. See Bochner [1], p. 76. 

† In fact Lévy, [7], p. 197, in deriving (2) from (6), assumes that (6) holds uniformly in every
finite t-interval. That this additional assumption is not necessary was shown by Bochner, [1], p. 72,
Theorem 21. Bochner’s statement proves that we can add suitable constants to our functions $w_n(x)$
so as to make them tend essentially to w(x), i.e., there is a sequence of constants $c_n$ and a sequence of
functions $\psi_n(x)$ such that


$w_n(x) + c_n = w(x) + \psi_n(x)$ and $\lim_{n\to\infty} \psi_n(x) = 0$


at every point of continuity of w(x). From $c_n = w(x) - w_n(x) + \psi_n(x)$ we derive for every point of
continuity of w(x) the inequalities

$w(x) - 1 \le w(x) - \limsup w_n(x) \le \liminf c_n \le \limsup c_n \le w(x) - \liminf w_n(x) \le w(x)$;

allowing here $x \to \infty$ and $x \to -\infty$, we derive the result $\liminf c_n = \limsup c_n = 0$, hence (2) holds. For
our particular purpose (Theorem 1) Lévy’s restricted statement would suffice, for without more
trouble we can prove that (16) (the analogue of (4)) holds uniformly in every finite t-interval.
(8)    $\sum_p \frac{1}{p}\,\|\log f(p)\|$ converges,


where, as a matter of notation, $\|x\| = \min(1, |x|)$. Then f(n) has the following
properties:

1. The sequence $u_n = f(n)$ has an asymptotic distribution function $\chi(u)$ with


$\chi(u) = \chi(+0) = 0$ for $u \le 0$,

and the Fourier transform of $\chi(e^x)$ ($-\infty < x < \infty$) is


(9)    $\int_{-\infty}^{\infty} e^{itx}\,d\chi(e^x) = L(t) = \prod_p \Big\{ \Big(1 - \frac{1}{p}\Big)\Big(1 + \frac{1}{p}\exp[it\log f(p)] + \frac{1}{p^2}\exp[it\log f(p^2)] + \cdots\Big) \Big\}$,


the infinite product being absolutely and uniformly convergent in every finite t-interval.

2. The set of points of increase of the distribution function $\chi(u)$ is identical
with the sequence of points $u_n = f(n)$ together with the limit points of this sequence.

3. The distribution function $\chi(u)$ is purely discontinuous if the series

(10)    $\sum_{f(p) \ne 1} \frac{1}{p}$ converges.


The function $\chi(u)$ is everywhere continuous if there exists a sequence of increasing
primes, $q_1, q_2, \cdots$ with

(11)    $f(q_\mu) \ne f(q_\nu)$ for $\mu \ne \nu$,

and such that

(12)    $\sum_\nu \frac{1}{q_\nu}$ diverges.*


* Davenport’s conditions (see [3], p. 10) for the existence of $\chi(u)$ are as follows:

(i) $0 < f(n) \le 1$,

(ii’) there are two positive constants C and c such that 
(8’) 0 — f(p*) S Cp-™ 
for a= 1 and all primes p. 

One should remark first that Theorem 1 imposes no conditions whatever on the values of $f(p^\alpha)$
for $\alpha > 1$, except condition (i). Moreover, for $\alpha = 1$ (ii’) gives $0 \le 1 - f(p) \le C p^{-c}$ and this implies
already the convergence of our series (8). Hence all of the inequalities (8’) for $\alpha > 1$ are superfluous as
far as the existence of the d.f. $\chi(u)$ is concerned.
5. In order to prove the first part of Theorem 1 let us consider the infinite
product


(13)    $L(t) = \prod_p \{ 1 + p^{-1}(\exp[it\log f(p)] - 1) + p^{-2}(\exp[it\log f(p^2)] - \exp[it\log f(p)]) + \cdots \} = \prod_p (1 + a_p)$.

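By means of the inequalities

$|\sin x| \le \|x\| = \min(1, |x|)$,    $\|xt\| \le \|x\| \max(1, |t|)$,


we have


$|a_p| \le p^{-1}\,|\exp[it\log f(p)] - 1| + p^{-2}\,|\exp[it\log f(p^2)] - \exp[it\log f(p)]| + \cdots$

$\le 2p^{-1}\,\big|\sin\big(\tfrac{t}{2}\log f(p)\big)\big| + 2(p^{-2} + p^{-3} + \cdots)$

$\le 2p^{-1}\,\|\log f(p)\| \max\big(1, \tfrac{|t|}{2}\big) + 2(p - 1)^{-2}$

$= p^{-1}\,\|\log f(p)\| \max(2, |t|) + 2(p - 1)^{-2}$.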


For $|t| \le T$, the series $\sum a_p$ is therefore dominated by the convergent series


$\sum_p \big( p^{-1}\,\|\log f(p)\| \max(2, T) + 2(p - 1)^{-2} \big)$


and the infinite product (13) converges absolutely and uniformly in every
finite t-interval. As a further consequence of the last result we have


$\prod_p \Big\{ 1 + \frac{|\exp[it\log f(p)] - 1|}{p} + \frac{|\exp[it\log f(p^2)] - \exp[it\log f(p)]|}{p^2} + \cdots \Big\} = \sum_{m=1}^{\infty} \frac{|\rho_t(m)|}{m}$


with

(14)    $\rho_t(m) = \prod_{p^\alpha} \big( \exp[it\log f(p^\alpha)] - \exp[it\log f(p^{\alpha-1})] \big) = \sum_{d \mid m} \mu(d)\,\exp[it\log f(m/d)]$,


where $p^\alpha$ are the powers of different primes in the canonical decomposition
of m and where μ(m) is the Möbius function. From the convergence of the
last series one readily derives the relation


(15)    $\lim_{n\to\infty} \frac{1}{n} \sum_{m=1}^{n} |\rho_t(m)| = 0$.


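By the inversion formula of Möbius, (14) implies


$\exp[it\log f(m)] = \sum_{d \mid m} \rho_t(d)$,

and therefore


$\frac{1}{n}\sum_{m=1}^{n} \exp[it\log f(m)] = \frac{1}{n}\sum_{m=1}^{n}\sum_{d \mid m} \rho_t(d) = \frac{1}{n}\sum_{m=1}^{n} \rho_t(m)\Big[\frac{n}{m}\Big] = \sum_{m=1}^{n} \frac{\rho_t(m)}{m} - \frac{1}{n}\sum_{m=1}^{n} \rho_t(m)\,R\Big(\frac{n}{m}\Big)$,


where R(x) = x − [x]. Using (15) we derive


(16)    $\lim_{n\to\infty} \frac{1}{n}\sum_{m=1}^{n} \exp[it\log f(m)] = \sum_{m=1}^{\infty} \frac{\rho_t(m)}{m} = L(t)$,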
where L(t) is the infinite product (13). Since L(t) is continuous, (16) and
Lemma 2 show that the sequence {log f(n)} admits an a.d.f. w(x) whose
Fourier transform is (5), hence the original sequence {f(n)} admits the a.d.f.
$\chi(u)$ defined as follows:


$\chi(u) = 0$ for $u \le 0$,    $\chi(u) = w(\log u)$ for $u > 0$,


with $\chi(+0) = w(-\infty) = 0$. The remark that $w(x) = \chi(e^x)$ completes the proof
of the first part of Theorem 1.
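
The limiting relation (16) can be checked numerically for a concrete function. The sketch below is an illustration only, under the assumption f(n) = φ(n)/n, for which every prime-power value equals 1 − 1/p, so that each factor of the product (9) collapses to $1 - 1/p + p^{-1}\exp[it\log(1 - 1/p)]$. The function names and the cut-off 20000 are arbitrary choices, not part of the original argument.

```python
import cmath
import math

def totients(limit):
    """Euler's phi(n) for 0 <= n <= limit, by a sieve."""
    phi = list(range(limit + 1))
    for p in range(2, limit + 1):
        if phi[p] == p:  # still untouched, hence p is prime
            for k in range(p, limit + 1, p):
                phi[k] -= phi[k] // p
    return phi

def empirical_mean(t, limit):
    """(1/n) * sum over m <= n of exp(i t log f(m)) with f(m) = phi(m)/m."""
    phi = totients(limit)
    total = sum(cmath.exp(1j * t * math.log(phi[m] / m)) for m in range(1, limit + 1))
    return total / limit

def truncated_product(t, limit):
    """Product (9) for f = phi(n)/n, restricted to the primes p <= limit."""
    phi = totients(limit)
    product = 1.0 + 0.0j
    for p in range(2, limit + 1):
        if phi[p] == p - 1:  # p is prime
            z = cmath.exp(1j * t * math.log(1.0 - 1.0 / p))
            product *= 1.0 - 1.0 / p + z / p
    return product

print(empirical_mean(1.0, 20000))
print(truncated_product(1.0, 20000))
```

The two printed complex numbers should agree to a few decimal places; the agreement improves as the cut-off grows.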

6. Let us pass to the proof of the second part of Theorem 1. Jessen and
Wintner ([6], Theorem 3) have proved the following general result. Let
$\sigma_1(x), \sigma_2(x), \sigma_3(x), \cdots$ be a sequence of d.f. such that the convolution
$w_n(x) = \sigma_1(x) * \sigma_2(x) * \cdots * \sigma_n(x)$ converges essentially to a d.f. w(x); this
d.f. w(x) is called the infinite convolution of the sequence $\{\sigma_n(x)\}$, and we
write

(17)    $w(x) = \sigma_1(x) * \sigma_2(x) * \sigma_3(x) * \cdots$.


Let generally $S(\varphi)$ denote the set of points of increase of a d.f. $\varphi$. Then $S(w_n)$
is the vectorial sum of sets (in the sense of Bohr)


(18)    $S(w_n) = S(\sigma_1) + S(\sigma_2) + \cdots + S(\sigma_n)$,
and S(w) is the limit of $S(w_n)$ in the following sense: A point x belongs to S(w)
if and only if it is the limit of a sequence of points $x_n$ with $x_n \in S(w_n)$.

If the origin O belongs to all $S(\sigma_n)$, then $S(w_1) \subset S(w_2) \subset S(w_3) \subset \cdots$ in
virtue of (18), and now S(w) is identical with the closure of the ordinary
limit of $S(w_n)$, i.e., a point belongs to S(w) if it belongs to some $S(w_n)$ or else
is a limit of such points. Formula (9) shows that our d.f. $w(x) = \chi(e^x)$ is the
infinite convolution of the sequence of d.f. $\sigma_\nu(x)$ of Fourier transforms


(19)    $\int_{-\infty}^{\infty} e^{itx}\,d\sigma_\nu(x) = (1 - p_\nu^{-1})\big(1 + p_\nu^{-1}\exp[it\log f(p_\nu)] + p_\nu^{-2}\exp[it\log f(p_\nu^2)] + \cdots\big)$,


where $p_\nu$ stands for the νth prime; hence $S(w_n)$ is identical with the set of
points $\log f(p_1^{\alpha_1} \cdots p_n^{\alpha_n})$ ($\alpha_\nu \ge 0$) and the second part of Theorem 1 is established.

7. Passing to the third part of Theorem 1 we remark that (10) implies (8).
However, if (10) converges we need not consider (8) at all, for now the series


$\sum_p p^{-1}\big(\exp[it\log f(p)] - 1\big)$


is dominated by the convergent series

$2 \sum_{f(p) \ne 1} p^{-1}$

for all values of t, which implies the uniform convergence of product (13) for
all real t. The transform L(t) is therefore almost periodic and the d.f. w(x)
is necessarily purely discontinuous.

Let us now assume (8), (11) and (12) to hold. By (17) and (19) we have


$w = \sigma_1 * \sigma_2 * \sigma_3 * \cdots = (\sigma_1 * \cdots * \sigma_{n-1} * \sigma_{n+1} * \cdots) * \sigma_n = \varphi_n * \sigma_n$.


Denoting by $w(E)$, $\varphi_n(E)$, $\sigma_n(E)$ the set functions corresponding to the point
functions $w$, $\varphi_n$, $\sigma_n$, we have


$w(E) = \int \varphi_n(E - x)\,d\sigma_n(x)$.†


If w(x) is not everywhere continuous, there is a point x whose mass is $w(\{x\}) = c > 0$.
Now by (19), writing $\log f(p_n) = \lambda_n$,


$w(\{x\}) = \Big(1 - \frac{1}{p_n}\Big)\Big( \varphi_n(\{x\}) + \frac{1}{p_n}\,\varphi_n(\{x - \lambda_n\}) + \frac{1}{p_n^2}\,\varphi_n(\{x - \log f(p_n^2)\}) + \cdots \Big)$.


† A Lebesgue-Radon integral. For notation and references see [6], §2.


Since $0 \le \varphi_n(E) \le 1$, we may infer that

$\varphi_n(\{x\}) > c/2$ for n sufficiently large,

hence


$w(\{x + \lambda_n\}) \ge \Big(\frac{1}{p_n} - \frac{1}{p_n^2}\Big)\varphi_n(\{x\}) > \frac{c}{2}\Big(\frac{1}{p_n} - \frac{1}{p_n^2}\Big)$.


This result, however, leads to a contradiction, for $\sum w(\{x + \lambda_n\})$, summed over
all different $\lambda_n$ (with n large), is on one hand ≤ 1, on the other hand it is
$> (c/2)\sum (p_n^{-1} - p_n^{-2})$, and this last series diverges by our assumptions (11)
and (12).*


III. FREQUENCIES OF CERTAIN CLASSES OF INTEGERS 


8. Theorem 1 applies with great ease to the multiplicative arithmetical
functions

f(n) = φ(n)/n    (φ(n) is the Euler function),†


f(n) = n/σ(n)    (σ(n) = sum of divisors of n),‡


for $f(p) = 1 - 1/p$ and $p/(1 + p)$ respectively, and both series


$\sum_p \frac{1}{p}\Big|\log\Big(1 - \frac{1}{p}\Big)\Big|$,    $\sum_p \frac{1}{p}\log\Big(1 + \frac{1}{p}\Big)$

are convergent. Moreover both functions are everywhere dense in the interval
(0, 1), as even their values of the form


$f(q_1 q_2 \cdots q_m) = \Big(1 - \frac{1}{q_1}\Big) \cdots \Big(1 - \frac{1}{q_m}\Big)$ and $f(q_1 q_2 \cdots q_m) = \Big(1 + \frac{1}{q_1}\Big)^{-1} \cdots \Big(1 + \frac{1}{q_m}\Big)^{-1}$ respectively,


where $q_1, \cdots, q_m$ are different primes, have this property.§ Since the f(p)
are all different, the distribution functions are continuous. We can therefore
conclude that the frequency $F\{\sigma(n) > kn\}$ of k-abundant numbers|| is continuous
and strictly decreasing for increasing $k \ge 1$.
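
The frequency of k-abundant numbers can be approximated by direct counting. The following sketch is illustrative only (the function names and the cut-off are assumptions, and a finite count only approximates the frequency); it tabulates the proportion of n ≤ N with σ(n) > kn for a few values of k.

```python
def divisor_sums(limit):
    """sigma(n) for 0 <= n <= limit, by adding each divisor d to its multiples."""
    sigma = [0] * (limit + 1)
    for d in range(1, limit + 1):
        for multiple in range(d, limit + 1, d):
            sigma[multiple] += d
    return sigma

def k_abundant_frequency(k, limit):
    """Proportion of n <= limit with sigma(n) > k*n (finite approximation)."""
    sigma = divisor_sums(limit)
    count = sum(1 for n in range(1, limit + 1) if sigma[n] > k * n)
    return count / limit

for k in (1.0, 1.5, 2.0, 2.5, 3.0):
    print(k, k_abundant_frequency(k, 20000))
```

The printed proportions decrease as k grows, in line with the monotonicity stated above; the case k = 2 corresponds to the ordinary abundant numbers.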


* The conditions (11), (12), and the above proof of their sufficiency are due to Dr. Jessen. My 
original conditions were more stringent. 

† Schoenberg [8].

‡ Davenport [3].

§ See [8], p. 194. 

|| Davenport [3], p. 830. 
These results may be extended to various generalizations of Euler’s function
as well as to the function


$f(n) = n^s/\sigma_s(n) = n^s \big/ \sum_{d \mid n} d^s$    (s > 0).


For $0 < s \le 1$, this last function is everywhere dense in (0, 1); if s > 1 we have
$\liminf f(n) = 1/\zeta(s)$, $\limsup f(n) = 1$.†


9. We now shall apply Theorem 1 to certain arithmetical functions lead- 
ing to purely discontinuous distribution functions. A few preliminary consid- 
erations are necessary. 

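Let $x_1, x_2, x_3, \cdots$ be a sequence of real elements. Let $\lambda_1, \lambda_2, \lambda_3, \cdots$ be all
the different values of the elements of the sequence $\{x_n\}$, i.e., $\lambda_\mu \ne \lambda_\nu$ if $\mu \ne \nu$
while any $x_n$ is equal to some $\lambda_m$ and vice versa. Let us further assume that
the sequence $\{x_n\}$ has an a.d.f. w(x) whose Fourier transform is


(20)    $\int_{-\infty}^{\infty} e^{itx}\,dw(x) = \sum_{m=1}^{\infty} A_m e^{i\lambda_m t}$,


hence

(21)    $\sum_{m=1}^{\infty} A_m = 1$.


We shall need the following

Lemma 3. If the sequence $\{\lambda_m\}$ has no finite limit point, then

(22)    $F\{x_n = \lambda_m\} = A_m$,

(23)    $F\{x_n = \lambda_{m_1} \text{ or } \lambda_{m_2} \text{ or } \cdots\} = A_{m_1} + A_{m_2} + A_{m_3} + \cdots$.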


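For if the interval $\lambda_m - \epsilon \le x \le \lambda_m + \epsilon$ is free of values $\lambda_\mu$ ($\mu \ne m$), then $\lambda_m \pm \epsilon$
are points of continuity (in fact points of constancy) of w(x), hence


$\lim_{n\to\infty} \frac{1}{n}\,N(x_\nu \le \lambda_m \pm \epsilon)_n = w(\lambda_m \pm \epsilon) = w(\lambda_m \pm 0)$


and therefore


$\lim_{n\to\infty} \frac{1}{n}\{ N(x_\nu \le \lambda_m + \epsilon)_n - N(x_\nu \le \lambda_m - \epsilon)_n \} = \lim_{n\to\infty} \frac{1}{n}\,N(x_\nu = \lambda_m)_n = w(\lambda_m + 0) - w(\lambda_m - 0) = A_m$,

which proves (22).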


† Gronwall [5]; presumably f(n) is everywhere dense in $(\zeta^{-1}(s), 1)$.
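Let $C_m$ be the class of integers ν for which $x_\nu = \lambda_m$. The totality T of all
positive integers is thus decomposed as a sum of classes

$T = C_1 + C_2 + C_3 + \cdots$    with $F\{C_m\} = A_m$,

and (23) is equivalent to

(24)    $F\{C_{m_1} + C_{m_2} + \cdots\} = F\{C_{m_1}\} + F\{C_{m_2}\} + \cdots$.†


From


$C = C_{m_1} + C_{m_2} + \cdots = C_{m_1} + \cdots + C_{m_{k-1}} + (C_{m_k} + C_{m_{k+1}} + \cdots)$

we get


(25)    $\overline{F}\{C\} \le F\{C_{m_1}\} + \cdots + F\{C_{m_{k-1}}\} + \overline{F}\{C_{m_k} + C_{m_{k+1}} + \cdots\}$.

On the other hand

$\overline{F}\{C_{m_k} + C_{m_{k+1}} + \cdots\} \le \overline{F}\{C_k + C_{k+1} + \cdots\} = 1 - F\{C_1\} - \cdots - F\{C_{k-1}\} \to 0$ as $k \to \infty$

in view of (21). Hence, as $k \to \infty$, (25) gives

$\overline{F}\{C\} \le F\{C_{m_1}\} + F\{C_{m_2}\} + \cdots$.


This together with the obvious relation


$\underline{F}\{C\} \ge F\{C_{m_1}\} + F\{C_{m_2}\} + \cdots$


proves (24).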

10. Let us divide the totality T of positive integers into various classes
as follows. Call C(1) the class of square-free numbers. If n is not square-free
let $q_1^{\alpha_1} q_2^{\alpha_2} \cdots q_r^{\alpha_r}$ ($\alpha_1 > 1, \cdots, \alpha_r > 1$) be the product of all the powers of
primes in its canonical decomposition and call $C(q_1^{\alpha_1} q_2^{\alpha_2} \cdots q_r^{\alpha_r})$ the class of
all numbers n having the same product $q_1^{\alpha_1} \cdots q_r^{\alpha_r}$ of powers of primes in
their canonical decomposition.

What is the frequency $F\{C(q_1^{\alpha_1} q_2^{\alpha_2} \cdots q_r^{\alpha_r})\}$ of the class $C(q_1^{\alpha_1} q_2^{\alpha_2} \cdots q_r^{\alpha_r})$?
It can be immediately computed from (9) by specializing conveniently the
function f(n). For all primes let


(26)    $f(p) = 1$,    $f(p^\alpha) = p^\alpha$ for $\alpha > 1$.

The series (10) is void and the a.d.f. w(x) of the sequence {log f(n)} is purely
discontinuous. Since $\prod_p (1 - p^{-2}) = \zeta(2)^{-1} = 6/\pi^2$, its Fourier transform given by
(9) becomes
† This relation, which I owe to Dr. Jessen, is not true for every decomposition $T = C_1 + C_2 + \cdots$,
for even the relation $F\{C_1 + C_2 + \cdots\} = F\{C_1\} + F\{C_2\} + \cdots$ breaks down in the case of the decomposition
T = (1) + (2) + (3) + ···.
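(27)    $\int_{-\infty}^{\infty} e^{itx}\,dw(x) = \prod_p \Big(1 - \frac{1}{p^2}\Big) \prod_p \Big\{ 1 + p^{-2}\Big(1 + \frac{1}{p}\Big)^{-1}\exp[it\log p^2] + p^{-3}\Big(1 + \frac{1}{p}\Big)^{-1}\exp[it\log p^3] + \cdots \Big\}$

$= \frac{6}{\pi^2} \sum_{(\alpha > 1)} \frac{1}{q_1^{\alpha_1} \cdots q_r^{\alpha_r}} \Big(1 + \frac{1}{q_1}\Big)^{-1} \cdots \Big(1 + \frac{1}{q_r}\Big)^{-1} \exp[it\log(q_1^{\alpha_1} \cdots q_r^{\alpha_r})]$.


Since $\log f(n) = \log(q_1^{\alpha_1} q_2^{\alpha_2} \cdots q_r^{\alpha_r})$ ($\alpha > 1$) if and only if n belongs to the
class $C(q_1^{\alpha_1} q_2^{\alpha_2} \cdots q_r^{\alpha_r})$ and since the various values of $\log(q_1^{\alpha_1} \cdots q_r^{\alpha_r})$ have
no finite limit point (as logarithms of different integers) Lemma 3 shows that


(28)    $F\{C(q_1^{\alpha_1} \cdots q_r^{\alpha_r})\} = \frac{6}{\pi^2} \cdot \frac{1}{q_1^{\alpha_1} \cdots q_r^{\alpha_r}} \Big(1 + \frac{1}{q_1}\Big)^{-1} \cdots \Big(1 + \frac{1}{q_r}\Big)^{-1}$.


By the same lemma we obtain the frequency of a sum of classes $C(q_1^{\alpha_1} \cdots q_r^{\alpha_r})$
by simply adding the frequencies of the individual classes.† Feller and Tornier
determine the frequency of the class of numbers n which have an even number
of powers of primes in their canonical decomposition (loc. cit., p. 229).
We may obtain their result directly from (28), for


$\sum_{r \text{ even}} F\{C(q_1^{\alpha_1} \cdots q_r^{\alpha_r})\} = \frac{6}{\pi^2} \sum_{r \text{ even}} \frac{1}{(q_1^2 - 1) \cdots (q_r^2 - 1)} = \frac{3}{\pi^2} \Big\{ \prod_p \Big(1 + \frac{1}{p^2 - 1}\Big) + \prod_p \Big(1 - \frac{1}{p^2 - 1}\Big) \Big\} = \frac{1}{2} + \frac{1}{2} \prod_p (1 - 2p^{-2})$.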
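
Formula (28) and the Feller-Tornier value can be compared with direct counts. The sketch below is an illustration under stated assumptions: the function names are hypothetical, the counts run only up to a finite bound, and the product over primes is truncated, so the agreement is approximate.

```python
import math

def squarefull_part(n):
    """Return (product of the prime powers p^a with a > 1 in n, number r of such primes)."""
    part, r, p = 1, 0, 2
    while p * p <= n:
        if n % p == 0:
            a = 0
            while n % p == 0:
                n //= p
                a += 1
            if a > 1:
                part *= p ** a
                r += 1
        p += 1
    return part, r  # any remaining factor of n is a prime with exponent 1

def formula_28(prime_powers):
    """Right-hand side of (28) for a dict {q: alpha}, every alpha > 1; {} gives C(1)."""
    value = 6.0 / math.pi ** 2
    for q, alpha in prime_powers.items():
        value /= q ** alpha * (1.0 + 1.0 / q)
    return value

def feller_tornier_value(prime_limit):
    """1/2 + 1/2 * product of (1 - 2/p^2) over the primes p <= prime_limit."""
    sieve = [True] * (prime_limit + 1)
    product = 1.0
    for p in range(2, prime_limit + 1):
        if sieve[p]:
            product *= 1.0 - 2.0 / (p * p)
            for multiple in range(p * p, prime_limit + 1, p):
                sieve[multiple] = False
    return 0.5 + 0.5 * product

def empirical_frequencies(limit):
    """Observed proportion of each class C(part) and of the numbers with r even."""
    counts, even = {}, 0
    for n in range(1, limit + 1):
        part, r = squarefull_part(n)
        counts[part] = counts.get(part, 0) + 1
        if r % 2 == 0:
            even += 1
    return {part: c / limit for part, c in counts.items()}, even / limit

freqs, even = empirical_frequencies(20000)
print(freqs[1], formula_28({}))        # square-free numbers
print(freqs[4], formula_28({2: 2}))    # class C(2^2)
print(even, feller_tornier_value(1000))
```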

11. The derivation of (28) was essentially based on the fact that the sequence
{log f(n)} defined by (26) has an a.d.f. w(x) with the transform (27).
It is of some interest to point out that this result may be derived by means of
elementary properties of trigonometric polynomials only, without reference
to Stieltjes integrals or the more refined theorems used in §I. To discuss a
somewhat more general situation than (26) let us assume (a) that the series
(10) converges, (b) that the sequence $\{\lambda_m\}$ of the different values of the elements
of the sequence {log f(n)} has $+\infty$ as its only limiting point, in which
case by renumbering the $\lambda_m$ we may assume $\lambda_1 < \lambda_2 < \cdots < \lambda_m \to \infty$.


† This is Theorem 19 of Feller and Tornier, [4], p. 228.

We know that (10) implies that L(t) is an almost periodic function with
a Fourier expansion of the type


(29)    $L(t) = \sum_{m=1}^{\infty} A_m e^{i\lambda_m t}$    ($A_m \ge 0$, $\sum_{m=1}^{\infty} A_m = 1$).


Another immediate consequence of (10) is that the limiting relation (16) holds
uniformly for all real t. Hence if we write


$\frac{1}{n}\sum_{m=1}^{n} \exp[it\log f(m)] = \sum_m A_m^{(n)} e^{i\lambda_m t}$,


we know that these polynomials converge uniformly to L(t). From
$A_m = M\{L(t)e^{-i\lambda_m t}\}$ and the similar formulas for our polynomials we get


$\lim_{n\to\infty} A_m^{(n)} = A_m$


and therefore (in the notation of §I)


$w_n(x) = \sum_{\lambda_m \le x} A_m^{(n)} \to \sum_{\lambda_m \le x} A_m = w(x)$


for every $x \ne \lambda_1, \lambda_2, \cdots$, for both of the sums involved in the last relation
contain a finite number of terms only. But this relation means precisely that
w(x) is the a.d.f. of the sequence {log f(n)}.

12. A great number of examples of classes of integers could be indicated 
whose frequencies can be computed by the method used above. We shall 
discuss only two more examples already considered by Feller and Tornier 
([4], pp. 215 and 224).

Let Γ be the class of numbers of the form $n = q_1^{\alpha_1} q_2^{\alpha_2} \cdots q_r^{\alpha_r}$ ($\alpha > 1$), i.e.,
if $p \mid n$ then also $p^2 \mid n$. To compute F{Γ} let $\Gamma_m$ be the class of numbers for
which the above property ($p \mid n$ implies $p^2 \mid n$) is required only for the first m
primes $p_1, \cdots, p_m$. Obviously $\Gamma \subset \Gamma_m$. Consider the multiplicative function
f(n) defined by


$f(p_1) = p_1, \cdots, f(p_m) = p_m$,    $f(p_{m+1}) = f(p_{m+2}) = \cdots = 1$,
$f(p^\alpha) = 1$ for $\alpha > 1$.


(10) is fulfilled and (9) becomes


$L(t) = \prod_{\nu=1}^{m} \{ 1 - p_\nu^{-1} + p_\nu^{-2} + p_\nu^{-1}(1 - p_\nu^{-1}) \exp[it\log p_\nu] \}$    (m = 1, 2, 3, ···).


Since log f(n) = 0 if and only if $n \in \Gamma_m$, (22) gives


$\overline{F}\{\Gamma\} \le F\{\Gamma_m\} = \prod_{\nu=1}^{m} (1 - p_\nu^{-1} + p_\nu^{-2})$ which $\to 0$ as $m \to \infty$,


hence F{Γ} = 0.

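Let $k_1, k_2, k_3, \cdots$ be a sequence of positive integers some of which may be
infinite and let K be the class of integers not divisible by any of the prime
powers $p_\nu^{k_\nu}$ (ν = 1, 2, 3, ···; $k_\nu = \infty$ means that there is no restriction at all
with respect to $p_\nu$). To determine F{K} let us assume first that the series


(30)    $\sum_{\nu=1}^{\infty} p_\nu^{-k_\nu}$

converges. Define a multiplicative function f(n) by


$f(p_\nu) = f(p_\nu^2) = \cdots = f(p_\nu^{k_\nu - 1}) = 1$,    $f(p_\nu^\alpha) = p_\nu^\alpha$ for $\alpha \ge k_\nu$    (ν = 1, 2, 3, ···).


Since

$\sum_{f(p) \ne 1} p^{-1} = \sum_{(k_\nu = 1)} p_\nu^{-1}$


converges, the a.d.f. of {log f(n)} is a step function with the transform


$L(t) = \prod_{\nu=1}^{\infty} \{ (1 - p_\nu^{-1})(1 + p_\nu^{-1} + \cdots + p_\nu^{-k_\nu + 1} + p_\nu^{-k_\nu} \exp[it\log p_\nu^{k_\nu}] + \cdots) \}$.


Since log f(n) = 0 if and only if $n \in K$, we have


(31)    $F\{K\} = \prod_{\nu=1}^{\infty} \{ (1 - p_\nu^{-1})(1 + p_\nu^{-1} + \cdots + p_\nu^{-k_\nu + 1}) \} = \prod_{\nu=1}^{\infty} (1 - p_\nu^{-k_\nu})$.


If (30) diverges let $K_m$ be the class of integers similar to K but for the new
sequence of exponents $k_1, k_2, \cdots, k_m, \infty, \infty, \cdots$; then $K \subset K_m$ and (31) applied
for the new sequence gives


$\overline{F}\{K\} \le F\{K_m\} = \prod_{\nu=1}^{m} (1 - p_\nu^{-k_\nu})$ which $\to 0$ as $m \to \infty$.

Hence F{K} = 0 and the formula (31) is again valid.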
Obviously our last two examples are also particular cases of the elemen- 
tary scheme discussed above. 
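
Formula (31) lends itself to a quick numerical check. The sketch below is illustrative and not part of the original text: the exponent sequence is passed as a hypothetical function `exponent_of(p)` that returns $k_\nu$ for the prime p, or `None` when $k_\nu = \infty$; both the product and the count are truncated at finite bounds.

```python
def primes_up_to(limit):
    """All primes <= limit (limit >= 2), by the sieve of Eratosthenes."""
    sieve = [True] * (limit + 1)
    sieve[0:2] = [False, False]
    for p in range(2, int(limit ** 0.5) + 1):
        if sieve[p]:
            for multiple in range(p * p, limit + 1, p):
                sieve[multiple] = False
    return [p for p in range(2, limit + 1) if sieve[p]]

def product_31(exponent_of, prime_limit):
    """Product (31), restricted to the primes p <= prime_limit."""
    value = 1.0
    for p in primes_up_to(prime_limit):
        k = exponent_of(p)
        if k is not None:  # None stands for an infinite exponent: no restriction
            value *= 1.0 - p ** (-k)
    return value

def empirical_frequency(exponent_of, limit):
    """Proportion of n <= limit not divisible by any forbidden prime power p^k."""
    excluded = [False] * (limit + 1)
    for p in primes_up_to(limit):
        k = exponent_of(p)
        if k is None:
            continue
        step = p ** k
        for multiple in range(step, limit + 1, step):
            excluded[multiple] = True
    return sum(1 for n in range(1, limit + 1) if not excluded[n]) / limit

cube_free = lambda p: 3                      # k = 3 for every prime
only_two = lambda p: 1 if p == 2 else None   # K = odd numbers
print(empirical_frequency(cube_free, 50000), product_31(cube_free, 50000))
print(empirical_frequency(only_two, 1000), product_31(only_two, 100))
```

With every exponent equal to 3 the class K consists of the cube-free numbers, and the product is the truncation of $1/\zeta(3)$.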
Added in proof, January, 1936. In a recent note On the density of some
sequences of numbers, Journal of the London Mathematical Society, vol. 10
(1935), pp. 120-125, P. Erdős proves without using Fourier analysis two
theorems which in our notation are as follows.


1. If f(n) is a multiplicative function satisfying the conditions

(32)    $f(n) \ge 1$,

(33)    $\sum_p \|\log f(p)\|/p$ converges,

(34)    $f(p_1) \ne f(p_2)$ if $p_1$, $p_2$ are different primes,

the sequence {f(n)} has a continuous asymptotic distribution function.


2. If the multiplicative function f(n) satisfying (32) is such that

(35)    $\sum_p \|\log f(p)\|/p$ diverges,

then

(36)    $F\{f(n) \ge r\} = 1$ for any real $r \ge 1$.


The first theorem of Erdős is obviously a consequence of Theorem 1.
This is not true for the second theorem; I want to show, however, how it can 
be derived from Theorem 1 by a simple additional argument involving 
moments rather than Fourier transforms. 

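Let $f_k(n)$ be an auxiliary multiplicative function defined as follows:


(37)    $f_k(p_\nu^\alpha) = f(p_\nu^\alpha)$ (ν = 1, 2, ···, k),    $f_k(p_\nu^\alpha) = 1$ (ν > k)    ($\alpha \ge 1$).

The sequence $\{\log f_k(n)\}$ has an a.d.f. $w_k(x)$ and the a.d.f. of the sequence
$\{1/f_k(n)\}$ (contained within $0 < t \le 1$) is therefore $\chi_k(t) = 1 - w_k(-\log t)$. For s > 0
we have


(38)    $\prod_{\nu=1}^{k} \{ (1 - p_\nu^{-1})(1 + p_\nu^{-1} \exp[-s\log f(p_\nu)] + \cdots) \} = \int_0^{\infty} e^{-sx}\,dw_k(x) = \int_0^1 t^s\,d\chi_k(t)$.


The product (38) tends to zero as $k \to \infty$ on account of (35). But


$\lim_{k\to\infty} \int_0^1 t^s\,d\chi_k(t) = 0$, for s > 0,


implies (see [8], pp. 175-176) $\chi_k(t) \to 1$ for $0 < t \le 1$. Hence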




F{f(n) 27} =F{f(™) sr} <r} 


and (36) is proved. The last inequality for frequencies follows from
$f_k(n) \le f(n)$ which is due to (32) and (37).
THE INTEGRAL REPRESENTATION OF UNBOUNDED
SELF-ADJOINT TRANSFORMATIONS IN 
HILBERT SPACE* 


BY 
FREDERICK RIESZ AND E. R. LORCH


INTRODUCTION 


In this note we are concerned with unbounded self-adjoint transformations
in Hilbert space. Denoting such a transformation by A we give two
short demonstrations of the facts associated with the formula


(1)    $A = \int_{-\infty}^{\infty} \lambda\,dE_\lambda(A)$.

That is, we establish the integral representation of A by means of the resolution
of the identity $E_\lambda(A)$ corresponding to A (for definitions, see below). The
facts in question for the case of bounded transformations have been known
in substance since the appearance in 1906 of Hilbert’s memoir on integral
equations. The unbounded case on the other hand has received attention only
in recent years.† Each of the three methods given to establish (1) for unbounded
transformations is based to a certain extent on the theory of
bounded transformations, indeed to the extent of making use of the transformation
$(A - iE)^{-1}$; but none of these methods exploits formula (1) for
the bounded case and its immediate consequences systematically. Our purpose
is to show that by doing so the proofs can be shortened considerably.

The fundamental idea of the first proof is to establish the existence of 
an orthogonal system of closed linear manifolds sweeping out the whole space 
and such that A behaves like a bounded transformation in each manifold 
considered as a Hilbert space. Thus for each manifold, formula (1) is valid; 
the behavior of A in the original space is obtained from its behavior in each 
manifold by a simple synthesis. 

The motivating idea of the second proof is to introduce a bounded self- 
adjoint transformation D whose resolution of the identity is topologically 


* Presented to the Society, September 10, 1935; received by the editors September 26, 1935. 

† Cf. J. von Neumann, Allgemeine Eigenwerttheorie Hermitescher Funktionaloperatoren, Mathematische
Annalen, vol. 102 (1929), pp. 49-131; M. H. Stone, Linear transformations in Hilbert space,
Proceedings of the National Academy of Sciences, vol. 15 (1929), pp. 198-200 and 423-425; F. Riesz,
Über die linearen Transformationen des komplexen Hilbertschen Raumes, Acta Litterarum ac Scientiarum
(Szeged), vol. 5 (1930), pp. 23-54.
equivalent to that of A, or, in the language of the operational calculus,
D is a monotone function of A. The particular function which we use is
$\lambda|\lambda|/(1 + \lambda^2)$. On first thought one is tempted to use the function $\arctan \lambda$,
but the former function, being practically rational, leads to easier calculations
than the latter.

The paper is divided into five parts. In §1 we repeat the definition of a
self-adjoint transformation and establish directly from this definition a 
lemma of interest in the unbounded case. §2 is devoted to bounded trans- 
formations. We state precisely what facts, known for over twenty years, 
we use in the subsequent proofs. In §§3 and 4 we carry out the first proof. 
§5 is devoted to the second. 

1. The definition of a self-adjoint transformation; a lemma. We use the
symbols A, B, C, etc., to represent linear transformations defined on a subset
of Hilbert space $\mathfrak{H}$ which is dense in $\mathfrak{H}$ (the subset may be the whole
space); the symbols $\Delta_A$, $\Delta_B$, $\Delta_C$, etc., represent the domains of definition of
these transformations; and f, g, k represent elements of $\mathfrak{H}$. In later sections,
functions of the real variable λ will be denoted by R(λ), S(λ), T(λ), etc. We
find with respect to a transformation A all pairs of elements [f, f*] satisfying
the equation


(2)    (Ag, f) = (g, f*) for all $g \in \Delta_A$.


The transformation A* defined by the equation A*f = f* is called the adjoint
of A. It is easy to verify in (2) that A* is a linear transformation. In addition,
A* has the property of closure or is a closed transformation; that is, if
$f_n \in \Delta_{A^*}$ (n = 1, 2, ···), and if $f_n \to f$, $A^* f_n \to f^*$,† then $f \in \Delta_{A^*}$ and A*f = f*. This
results from the continuity of the inner product (h, k). If $\Delta_A = \Delta_{A^*}$ and if
A = A* in this common domain, we say that A is self-adjoint.

We shall derive from the foregoing a lemma which is of use in the subsequent
discussion. It is convenient first to introduce a few definitions. Let
$\mathfrak{M} \subset \mathfrak{H}$ be a closed linear manifold, and let A be a transformation defined for
every element of $\mathfrak{M}$ and transforming $\mathfrak{M}$ into a subset of itself. For our present
purposes, A may or may not be defined outside of $\mathfrak{M}$. An expression of
the type “A is self-adjoint in $\mathfrak{M}$” will mean that the transformation A considered
in the Hilbert space (or space of a finite number of dimensions) $\mathfrak{M}$ is
self-adjoint. The behavior of A outside of $\mathfrak{M}$ does not in any way affect its
characterization in $\mathfrak{M}$.

Let $\mathfrak{M}_1, \mathfrak{M}_2, \cdots$ be an orthogonal sequence of closed linear manifolds,
that is, such that if $f_i \in \mathfrak{M}_i$, then $(f_i, f_j) = 0$, $i \ne j$. We use the symbol $\sum_{i=1}^{\infty} \mathfrak{M}_i$
to denote the smallest closed linear manifold containing each $\mathfrak{M}_i$. Evidently
$\sum_{i=1}^{\infty} \mathfrak{M}_i$ contains precisely all the elements of the form $f_1 + \cdots + f_n$ and their
limits. We have the


† The symbol $f_n \to f$ indicates that $\lim_{n\to\infty} \|f - f_n\| = 0$.


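Lemma. Let $\mathfrak{M}_1, \mathfrak{M}_2, \cdots$ be an orthogonal sequence of closed linear manifolds
such that $\sum_{i=1}^{\infty} \mathfrak{M}_i = \mathfrak{H}$. If $f \in \mathfrak{H}$, let $f_i$ denote the projection of f on $\mathfrak{M}_i$. Let
$A_i$ be transformations which are self-adjoint in $\mathfrak{M}_i$. Then there exists precisely
one self-adjoint transformation A which is identical with $A_i$ in $\mathfrak{M}_i$. The domain
of A, $\Delta_A$, consists of all $f \in \mathfrak{H}$ such that $\sum_{i=1}^{\infty} \|A_i f_i\|^2$ converges; if $f \in \Delta_A$ then
$Af = \sum_{i=1}^{\infty} A_i f_i$.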

We show first that the transformation A as defined above is self-adjoint.
If $f \in \Delta_A$, then clearly $A(f - f_i)$ is orthogonal to any element in $\mathfrak{M}_i$. This means
that A* is defined in $\mathfrak{M}_i$ and equal to A in $\mathfrak{M}_i$, since we have for any $f \in \Delta_A$
and $g \in \mathfrak{M}_i$


$(Af, g) = (A(f - f_i), g) + (Af_i, g) = (f_i, Ag) = (f, Ag)$.


Since A* is a closed transformation, $\Delta_{A^*} \supseteq \Delta_A$ and A* = A in $\Delta_A$. Now let
$h \in \Delta_{A^*}$; then


(3) (A*(h — hy), Ahi) = ((h — hy), A*hi) = 0, 
and hence 
= — hall’ + 


In turn 


Thus $h \in \Delta_A$, and since A* is a closed transformation, Ah = A*h.

To complete the proof of the lemma, we show that there exists only one
self-adjoint transformation which is equal to $A_i$ in $\mathfrak{M}_i$. Let B be any transformation
having the requisite properties. Since B is a closed transformation,
$\Delta_B \supseteq \Delta_A$ and A = B in $\Delta_A$. If $h \in \Delta_B$, we repeat the argument in (3) and what
immediately follows, replacing A* by B, and show that $h \in \Delta_A$. Hence A and B
are identical.

2. The operational calculus for the bounded case. Below appears a résumé 
of known facts concerning bounded self-adjoint transformations which form 
the basis of the subsequent discussion; no proofs are given. 

A projection P is a self-adjoint transformation such that $\Delta_P = \mathfrak{H}$, $P^2 = P$.
A family of projections, $E_\lambda$, $-\infty < \lambda < \infty$, is said to be a resolution of the
identity if it possesses the following properties: (1) $E_\lambda E_\mu = E_\mu E_\lambda = E_\lambda$, $\lambda < \mu$;
(2) $E_\lambda \to O$ (the zero transformation) or E (the identity transformation) as
$\lambda \to -\infty$ or $\infty$; and (3)

A transformation A is said to be bounded if there exists a constant c such
that $\|Af\| \le c\|f\|$, $f \in \Delta_A$. The least constant of this type is said to be the bound
of A; it is denoted by $M_A$. If A is bounded and possesses the property of
closure, it is easy to see that $\Delta_A = \mathfrak{H}$. If A is bounded and self-adjoint, it has
the following properties:

(a) Corresponding to A, there exists a unique resolution of the identity
$E_\lambda(A)$ enabling us to write equation (1); the integration here is of the Stieltjes
type and is to be interpreted in an obvious manner. The boundedness of A
is equivalent to the boundedness of the quadratic form (Af, f) under the
condition that $\|f\| = 1$. If m and M denote the greatest lower and the least
upper bound of this form, then $E_\mu(A) = O$, $\mu < m$, and $E_\mu(A) = E$, $\mu \ge M$. The
bound of A is the greater of the numbers |m| and |M|. The integral (1) may
always be replaced by an integral over a finite interval containing the points
λ = m and λ = M in its interior. In case $E_m(A) = E_{m-0}(A)$, and this will be the
case with which we have to deal, we may write


$A = \int_m^M \lambda\,dE_\lambda(A)$.

(b) If R(λ) represents a real function continuous for $m \le \lambda \le M$, the integral
$\int_m^M R(\lambda)\,dE_\lambda(A)$ represents a bounded self-adjoint transformation denoted
by R(A). In particular if R(λ) = 1, R(A) = E. We point out that the behavior
of R(λ) outside of the interval $m \le \lambda \le M$ is not of consequence in defining
R(A), since for such values of λ, $E_\lambda(A)$ is constant.

(c) $E_\lambda(A)$ and R(A) are permutable with every bounded transformation
permutable with A.

(d) For two continuous functions R(λ) and S(λ),


$\int R(\lambda)\,dE_\lambda(A) \cdot \int S(\lambda)\,dE_\lambda(A) = \int R(\lambda)S(\lambda)\,dE_\lambda(A)$.

3. The first proof: the transformations B and C. Let $A$ be an unbounded
self-adjoint transformation, that is to say, there does not exist a constant $c$
such that $\|Af\| \le c\|f\|$ for all $f$ in the domain of $A$. By an ingenious device
due to J. von Neumann,† we shall introduce the transformations $B$ and $C$ which are
the imaginary and real parts respectively of the transformation $(A - iE)^{-1}$.
Consider the set of all pairs $\{f, g\}$ of elements in $\mathfrak{H}$. These pairs
form a Hilbert space $\mathfrak{H}'$ if the fundamental operations are performed as
follows:

$$a\{f, g\} = \{af, ag\}, \quad a \text{ a complex number};$$

$$\{f_1, g_1\} + \{f_2, g_2\} = \{f_1 + f_2, \; g_1 + g_2\};$$

and

$$(\{f_1, g_1\}, \{f_2, g_2\}) = (f_1, f_2) + (g_1, g_2).$$

† Cf. J. von Neumann, Über adjungierte Funktionaloperatoren, Annals of Mathematics, vol. 33
(1932), pp. 294-310. Other methods for introducing the transformations B and C may be found in
the papers referred to in the second foot-note on p. 331.


Since $A$ is linear and closed, the manifold of all elements $\{Af, f\}$, $f$ in the
domain of $A$, in $\mathfrak{H}'$ is linear and closed. The elements of its
orthogonal complement are of the form $\{g, -Ag\}$. In fact, let $\{g, h\}$ be
orthogonal to each $\{Af, f\}$. Then

$$(\{Af, f\}, \{g, h\}) = (Af, g) + (f, h) = 0.$$

Since $A$ is self-adjoint, $g$ is in the domain of $A$ and $Ag = -h$.
Let $h \in \mathfrak{H}$; we express the element $\{h, 0\}$ in $\mathfrak{H}'$ as the
sum of its two orthogonal components

$$\{h, 0\} = \{Af, f\} + \{g, -Ag\}.$$

This gives

$$Af + g = h, \qquad f - Ag = 0. \tag{4}$$


Clearly $h$ determines $f$ and $g$ uniquely; in addition, the correspondences
between $h$ and $g$ and between $h$ and $f$ are linear. Let $B$ and $C$ be the
transformations defined by the equations $Bh = g$, $Ch = f$. The transformation $A^2$
is defined for all elements of the form $Bh$; the transformation $A$ is defined for
all elements of the form $Ch$. The formulas (4) may now be written in the form

$$AB = C \tag{5}$$

and

$$AC = E - B. \tag{6}$$

If $h$ is in the domain of $A$, then one may apply the transformation $A$ to the
formulas (4), which by the definition of $B$ and $C$ gives

$$BAh = Ag = ABh, \qquad CAh = Af = ACh,$$

that is,

$$BA = AB \quad \text{and} \quad CA = AC.$$


The two following equations may now be derived: 

$$C^2 = CAB = ACB = (E - B)B = B - B^2 \tag{7}$$

and

$$BC = BAB = ABB = CB. \tag{8}$$
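The relations (5)-(8) can be checked in a finite-dimensional model. The sketch below is an illustration only, not part of the argument: it assumes a real symmetric 2-by-2 matrix in place of the (unbounded) transformation $A$, so that solving (4) gives $B = (A^2 + E)^{-1}$ and $C = AB$. The helper names `matmul`, `add`, `sub`, and `inverse` are hypothetical, and exact rational arithmetic is used so the identities hold without rounding.

```python
from fractions import Fraction


def matmul(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def add(X, Y):
    return [[X[i][j] + Y[i][j] for j in range(2)] for i in range(2)]


def sub(X, Y):
    return [[X[i][j] - Y[i][j] for j in range(2)] for i in range(2)]


def inverse(X):
    """Inverse of a 2x2 matrix with nonzero determinant."""
    det = X[0][0] * X[1][1] - X[0][1] * X[1][0]
    return [[X[1][1] / det, -X[0][1] / det],
            [-X[1][0] / det, X[0][0] / det]]


E = [[Fraction(1), Fraction(0)], [Fraction(0), Fraction(1)]]
# Assumed stand-in for A: any real symmetric matrix would do.
A = [[Fraction(2), Fraction(1)], [Fraction(1), Fraction(-3)]]

# From (4): f = Ag and A^2 g + g = h, hence g = (A^2 + E)^{-1} h.
B = inverse(add(matmul(A, A), E))
C = matmul(A, B)                      # equation (5): AB = C

eq6 = matmul(A, C) == sub(E, B)               # AC = E - B
eq7 = matmul(C, C) == sub(B, matmul(B, B))    # C^2 = B - B^2
eq8 = matmul(B, C) == matmul(C, B)            # BC = CB
print(eq6, eq7, eq8)
```

In this bounded model $A^2 + E$ is always invertible, which mirrors the fact that $h$ determines $f$ and $g$ uniquely.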
The transformations $B$ and $C$ are self-adjoint. Referring to (4), and with
an evident notation, we have in turn for $B$ and $C$

$$(Bh_1, h_2) = (g_1, A^2 g_2 + g_2) = (A^2 g_1 + g_1, g_2) = (h_1, Bh_2),$$

$$(Ch_1, h_2) = (f_1, Af_2 + g_2) = (Af_1, Ag_2) + (f_1, g_2) = (h_1 - g_1, Ag_2) + (Ag_1, g_2) = (h_1, Ag_2) = (h_1, Ch_2).$$

The transformations $B$ and $C$ are bounded and $M_B \le 1$, $M_C \le 1$. This arises
from the orthogonal decomposition of $\{h, 0\}$ into $\{Af, f\}$ and $\{g, -Ag\}$,
which gives

$$(h, h) = (Af, Af) + (f, f) + (g, g) + (Ag, Ag),$$

and so

$$(Bh, Bh) = (g, g) \le (h, h), \qquad (Ch, Ch) = (f, f) \le (h, h).$$

Furthermore, by (7), the transformation $B = B^2 + C^2$ is positive definite;
more precisely,

$$0 \le (Bh, Bh) + (Ch, Ch) = (Bh, h) \le (h, h), \tag{9}$$

and the case $(Bh, h) = 0$ can arise only if $g = Bh = 0$, $f = Ch = 0$, hence
$h = Af + g = 0$.

4. Conclusion of the first proof. Since $Bh = 0$ implies $h = 0$ (see (9)),
$E_\lambda(B)$, the resolution of the identity of $B$, satisfies the equation
$E_0(B) = 0$. This is an immediate consequence of the interpretation of the integral
(1) where $A$ is to be replaced by $B$. Let $\mathfrak{M}_n$ ($n = 1, 2, \dots$) be the
closed linear manifold of all elements of the form
$[E_{1/n}(B) - E_{1/(n+1)}(B)]h$, $h \in \mathfrak{H}$. Then by the first
property of a resolution of the identity (see §2), the sequence
$\mathfrak{M}_1, \mathfrak{M}_2, \dots$ is an orthogonal sequence. Referring to the
first statement of this paragraph and to the fact that $E_1(B) = E$, we see that
$\sum_{n=1}^{\infty} \mathfrak{M}_n = \mathfrak{H}$. We now proceed to
show that the transformation $A$ is defined throughout $\mathfrak{M}_n$ and transforms
this manifold into a subset of itself.

Let $F$ be any transformation permutable with $E_\lambda(B)$ for all $\lambda$. Then

$$F[E_{1/n}(B) - E_{1/(n+1)}(B)] = [E_{1/n}(B) - E_{1/(n+1)}(B)]F,$$

and consequently $F$ transforms $\mathfrak{M}_n$ into a subset of itself. Specifically
$C$ and any function of $B$ transform this manifold into a subset of itself (see (8)
and §2(c)). Keeping the value of $n$ fixed, let $R(\lambda)$ be the continuous
function equal to $1/\lambda$ for $1/(n+1) \le \lambda \le 1/n$ and constant
otherwise. Then

$$R(B)B = BR(B) = E$$

in $\mathfrak{M}_n$ by §2(b) in conjunction with the equations

$$E_{1/n}(B)g = g, \qquad E_{1/(n+1)}(B)g = 0,$$

for any $g$ in $\mathfrak{M}_n$. In other words $R(B)$ and $B$ transform
$\mathfrak{M}_n$ into itself in a one-to-one fashion and are inverses of each other in
this manifold. Applying this to (5) we see that

$$A = ABR(B) = CR(B) \quad \text{in } \mathfrak{M}_n.$$

Thus $A$ is defined throughout $\mathfrak{M}_n$. Furthermore, since $C$ and $R(B)$
transform $\mathfrak{M}_n$ into a subset of itself, $A$ does also.

In $\mathfrak{M}_n$, $A$ is a bounded self-adjoint transformation. For the sake of a
simple notation, let us call the resolution of the identity of $A$ in this manifold
$E_{\lambda,n}$. That is, $E_{\lambda,n}$ represents a family of projections in the
Hilbert space (or space of a finite number of dimensions) $\mathfrak{M}_n$ having the
familiar properties of a resolution of the identity. Let $E_\lambda$ represent the
family of projections in the space $\mathfrak{H}$ which are identical with
$E_{\lambda,n}$ in $\mathfrak{M}_n$ for every $n$. Then $E_\lambda$ is a resolution of
the identity. The integral $\int_{-\infty}^{\infty} \lambda\, dE_\lambda$ represents a
self-adjoint transformation.† Since for any $g \in \mathfrak{M}_n$,
$E_\lambda g = E_{\lambda,n} g$, this transformation is identical with $A$ in
$\mathfrak{M}_n$. Applying our lemma we obtain equation (1) as desired.

5. The second proof. §3 in its entirety is a prerequisite for this proof.

Consider $E_\lambda(C)$, the resolution of the identity of the transformation $C$
introduced above. Let $\mathfrak{M}_-$ and $\mathfrak{M}_+$ be the closed linear
manifolds of all elements of the form $E_0(C)h$ and $[E - E_0(C)]h$,
$h \in \mathfrak{H}$, respectively. Thus $\mathfrak{M}_-$ is the orthogonal
complement of $\mathfrak{M}_+$. Let $D$ be the unique transformation linear in
$\mathfrak{H}$ which is defined by the equations

$$D = -E + B = -AC \text{ in } \mathfrak{M}_-; \qquad D = E - B = AC \text{ in } \mathfrak{M}_+.$$

Since $B$ and $C$ are permutable, $B$ and $E_0(C)$ are permutable by §2(c). Thus $B$,
and hence also $D$, transform $\mathfrak{M}_-$ into a subset of itself; a similar
statement can be made for $\mathfrak{M}_+$. Since $D$ is self-adjoint when considered
as a transformation in $\mathfrak{M}_-$ or in $\mathfrak{M}_+$, $D$ is self-adjoint in
the entire space $\mathfrak{H}$.

Equation (9) may be rewritten in the form

$$0 \ge -(h, h) + (Bh, h) > -(h, h), \qquad h \ne 0.$$

If we assume that $h \in \mathfrak{M}_-$, the central expression of this inequality is
precisely $(Dh, h)$. Similarly, for $h\ (\ne 0) \in \mathfrak{M}_+$,

$$0 \le (Dh, h) < (h, h).$$

Thus for $h \ne 0$, writing $h = h_1 + h_2$, $h_1 \in \mathfrak{M}_-$,
$h_2 \in \mathfrak{M}_+$,

$$-(h, h) = -(h_1, h_1) - (h_2, h_2) < (Dh_1, h_1) + (Dh_2, h_2) = (Dh, h) < (h, h).$$

† We consider the manifolds $[E_n - E_{n-1}]h$, $n = 0, \pm 1, \pm 2, \dots$. In each manifold the
transformation defined by the integral $\int_{-\infty}^{\infty} \lambda\, dE_\lambda$ is a bounded
self-adjoint transformation. We define $\int_{-\infty}^{\infty} \lambda\, dE_\lambda$ to represent
the unique self-adjoint transformation which in the manifold considered above is the transformation
there given and whose existence is guaranteed by the lemma of §1. It is clear that the system of
integers could be replaced by any other infinite set of numbers possessing no finite limiting value
but extending along the axis of reals indefinitely in both directions.

It follows immediately that $Dh = -h$ or $Dh = h$ implies $h = 0$. Thus by §2(a)

$$E_{-1}(D) = 0, \qquad E_{1-0}(D) = E,$$

where $E_\lambda(D)$ obviously represents the resolution of the identity of $D$.

Let $\cdots < a_{-n} < \cdots < a_{-1} < a_0 = 0 < a_1 < \cdots < a_n < \cdots$ be a set of
numbers converging on the left to $-1$, on the right to $1$. Let $\mathfrak{N}_i$ be
the manifold of all elements of the form $[E_{a_i}(D) - E_{a_{i-1}}(D)]h$,
$h \in \mathfrak{H}$. The sequence
$\mathfrak{N}_0, \mathfrak{N}_{\pm 1}, \mathfrak{N}_{\pm 2}, \dots$ is an orthogonal
sequence and $\sum_{i=-\infty}^{\infty} \mathfrak{N}_i = \mathfrak{H}$.

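Clearly $D$ and $B$ transform $\mathfrak{N}_i$ into a subset of itself. To prove that
$C$ has the same property, it will be sufficient to show that $C$ and $D$ are
permutable. If $h \in \mathfrak{M}_+$, we have

$$CDh = C(E - B)h = (E - B)Ch = DCh,$$

since $Ch \in \mathfrak{M}_+$. A similar remark demonstrates that $CDh$ and $DCh$ are
identical for $h \in \mathfrak{M}_-$. Since any element in $\mathfrak{H}$ may be
expressed as the sum of two elements, the one in $\mathfrak{M}_+$, the other in
$\mathfrak{M}_-$, $D$ and $C$ are permutable.

We now choose an $i \ge 1$ and hold it fixed. If $h \in \mathfrak{N}_i$, then

$$Bh = \int_{a_{i-1}}^{a_i} (1 - \lambda)\, dE_\lambda(D)h.$$

Allowing $R(\lambda)$ to be the continuous function equal to $(1 - \lambda)^{-1}$ for
$a_{i-1} \le \lambda \le a_i$ and constant otherwise, we may write

$$R(D)Bh = BR(D)h = h \quad \text{for } h \in \mathfrak{N}_i.$$

Hence

$$A = ABR(D) = CR(D), \tag{10}$$

which means that $A$ is defined throughout $\mathfrak{N}_i$ and transforms it into a
subset of itself.

From (7), we derive the equation

$$C^2 = D - D^2$$

valid in $\mathfrak{N}_i$ ($i \ge 1$). We assert that since $(Ch, h) \ge 0$,
$h \in \mathfrak{N}_i$,

$$C = \int_0^1 (\lambda - \lambda^2)^{1/2}\, dE_\lambda(D) \quad \text{in } \mathfrak{N}_i \ (i \ge 1). \tag{11}$$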
0 


More generally, we may state that if $G$ is a bounded self-adjoint transformation
such that $(Gh, h) \ge 0$, $h \in \mathfrak{H}$, then there exists a unique bounded
self-adjoint transformation $F$ such that $F^2 = G$ and $(Fh, h) \ge 0$,
$h \in \mathfrak{H}$. For let $F$ have the given properties; then

$$F = \int_0^\infty \lambda\, dE_\lambda(F) = \int_0^\infty \mu^{1/2}\, dE_{\mu^{1/2}}(F) \tag{12}$$

and

$$\int_0^\infty \mu\, dE_{\mu^{1/2}}(F) = \int_0^\infty \mu\, dE_\mu(G)$$

since $F^2 = G$. Since there is but one resolution of the identity corresponding
to a self-adjoint transformation, $E_{\mu^{1/2}}(F) = E_\mu(G)$, $0 \le \mu < \infty$.
Referring to (12), we have $F$ expressed as that function of $G$ which corresponds to
the numerical function $\mu^{1/2}$; in other words, $F$ is uniquely determined. This
proof is valid in $\mathfrak{H}$ and a fortiori in any manifold in which $F$ is
self-adjoint. Thus the relation (11) is valid.

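From (10) we deduce

$$A = \int_0^1 R(\lambda)(\lambda - \lambda^2)^{1/2}\, dE_\lambda(D) = \int_{a_{i-1}}^{a_i} \frac{(\lambda - \lambda^2)^{1/2}}{1 - \lambda}\, dE_\lambda(D) \quad \text{in } \mathfrak{N}_i \ (i \ge 1). \tag{13}$$

Now let $i \le 0$. Keeping in mind that $D = -E + B$ for this case, we see
that if $S(\lambda)$ represents the continuous function which equals
$(1 + \lambda)^{-1}$ for $a_{i-1} \le \lambda \le a_i$ and which is constant otherwise,

$$BS(D) = S(D)B = E \quad \text{in } \mathfrak{N}_i \ (i \le 0).$$

Furthermore, since in this manifold $(Ch, h) \le 0$,

$$C = -\int_{-1}^{0} (-\lambda - \lambda^2)^{1/2}\, dE_\lambda(D) \quad \text{in } \mathfrak{N}_i \ (i \le 0).$$

In view of (5) we obtain by a reasoning similar to that used above

$$A = \int_{-1}^{0} -S(\lambda)(-\lambda - \lambda^2)^{1/2}\, dE_\lambda(D) = \int_{a_{i-1}}^{a_i} -S(\lambda)(-\lambda - \lambda^2)^{1/2}\, dE_\lambda(D) \quad \text{in } \mathfrak{N}_i \ (i \le 0). \tag{14}$$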

Let $T(\lambda)$ be any real function continuous on the open interval
$-1 < \lambda < 1$. We define the symbol

$$T(D) = \int_{-1}^{1} T(\lambda)\, dE_\lambda(D)$$

in such a way as to represent a self-adjoint transformation. The transformation
in question is that which in the manifold $\mathfrak{N}_i$ introduced above is given
by the integral $\int_{a_{i-1}}^{a_i} T(\lambda)\, dE_\lambda(D)$. The meaning of the
latter has already been made clear since $T(\lambda)$ is continuous over the closure of
the range of integration. $T(D)$ is now defined to be the unique transformation whose
existence follows from our lemma. Clearly, $T(D)$ does not depend on the choice of the
numbers $a_i$.

In particular, let $T(\lambda)$ be the monotonic continuous function defined by
the equations

$$T(\lambda) = -\frac{(-\lambda - \lambda^2)^{1/2}}{1 + \lambda}, \quad -1 < \lambda \le 0; \qquad T(\lambda) = \frac{(\lambda - \lambda^2)^{1/2}}{1 - \lambda}, \quad 0 \le \lambda < 1.$$

By (13) and (14), $T(D) = A$ in $\mathfrak{N}_i$. Thus

$$A = \int_{-1}^{1} T(\lambda)\, dE_\lambda(D). \tag{15}$$

Since $T(\lambda)$ is monotonic continuous, it possesses an inverse having the same
properties, and we may write in (15)

$$\mu = T(\lambda), \qquad \lambda = V(\mu), \qquad \text{and} \qquad E_\lambda(D) = E_\mu(A).$$

We thus obtain the customary integral representation of $A$ (1).
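The change of variable can be illustrated on a single spectral value. The following sketch is a scalar model under stated assumptions, not the operator argument itself: a real number `a` plays the role of a point of the spectrum of $A$, so that $B$, $C$, and $D$ act as the numbers $1/(1+a^2)$, $a/(1+a^2)$, and $\pm(1 - B)$ according to the sign of $C$. The function names `T` and `d_of` are hypothetical.

```python
import math


def T(lam):
    """Piecewise function of (13)-(14) on the open interval (-1, 1)."""
    if lam >= 0:
        return math.sqrt(lam - lam * lam) / (1 - lam)
    return -math.sqrt(-lam - lam * lam) / (1 + lam)


def d_of(a):
    """Scalar value of D attached to the spectral value a of A."""
    b = 1.0 / (1.0 + a * a)      # scalar value of B
    c = a / (1.0 + a * a)        # scalar value of C
    # D = E - B where C >= 0, and D = -E + B where C < 0.
    return (1.0 - b) if c >= 0 else (b - 1.0)


samples = [-50.0, -3.0, -0.5, 0.0, 0.25, 1.0, 7.0, 120.0]
recovered = [T(d_of(a)) for a in samples]
for a, r in zip(samples, recovered):
    print(a, r)
```

Every value of `d_of` lies strictly between -1 and 1, and `T` carries it back to the original `a` up to rounding, which is the scalar content of $T(D) = A$.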

We may point out that formula (15) along with the equations of change 
of variable following it may be used to develop an operational calculus based 
on A. That is, if we assume that the operational calculus based on D has 
been elaborated, that based on A can be deduced in short order along lines 
which we need not describe here. 
A FUNCTIONAL EQUATION IN ARITHMETIC*


BY 
E. T. BELL 


1. Possible theories of arithmetical composition. The functional equation
to be discussed is that of associativity,

$$\varphi(x, \varphi(y, z)) = \varphi(\varphi(x, y), z),$$

which occurs in all theories of numerical functions hitherto considered. The
two most highly developed theories of this kind are those in which multiplication
in the ring of all numerical functions is abstractly identical with C
(Cauchy) or D (Dirichlet) multiplication of infinite series.

Lehmer's five postulates are sufficient for the development of a theory of
inversion as exemplified in the cases of C, D multiplication, without requiring,
as is the fact in those cases, that the function $\varphi(x, y)$ (his $\psi(x, y)$)
of composition be a polynomial. For C multiplication, $\varphi(x, y) = x + y - 1$,
instead of the usual $x + y$, as a change in notation justifies; for D
multiplication, $\varphi(x, y) = xy$.

But these are not the only $\varphi(x, y)$ which give an arithmetical theory of
composition (as in the papers cited); to mention only three further instances,
there is von Sterneck's "L.C.M. calculus," quoted by Lehmer, in addition
to the well known compositions $\varphi(x, y) = M(x, y)$, where $M$ is either "max"
or "min."

It is of considerable interest then to see precisely what position is occupied
in the general theory of composition developed in the paper (B) by the classical
theories in which multiplication is abstractly identical with either C or
D. We shall prove that if $\varphi(x, y)$ is a polynomial in $x$, $y$, then, in order
that the composition $\varphi(x, y) = n$, where $n$ is an arbitrary constant
integer $> 0$ and $x$, $y$ are variable integers $> 0$, shall lead to an arithmetical
theory of composition, it is necessary and sufficient that $\varphi(x, y)$ be either
$x + y - 1$ or $xy$, namely, that multiplication in the ring of all numerical
functions be either C or D.


* Presented to the Society, November 30, 1935; received by the editors October 29, 1935. 

† See the following papers in vol. 33 (1931) of these Transactions, where further references
(especially in (A), (B)) are given.

(A) R. Vaidyanathaswamy, The theory of multiplicative numerical functions, pp. 579-662. 

(B) E. T. Bell, Arithmetical composition and inversion of functions over classes, pp. 897-933. 

(C) D. H. Lehmer, Arithmetic of double series, pp. 945-957. 

The essential points for the arithmetical theory are the existence of a “unit function” and 
“inverses.” Lehmer gives a postulational treatment. 
2. Associativity. Although only I, III, IV of Lehmer's five postulates are
required when $\varphi(x, y)$ is a polynomial, we shall give the full set to indicate
the content of the theory. For the purposes of the arithmetical theory it
suffices to restrict $x, y, z, \dots, n, \dots$ to be integers $> 0$; when this is
not assumed, $x, y, z, \dots$ are arbitrary elements of a field. Lehmer's postulates
are as follows.

POSTULATE I. For each integer $n > 0$, $\varphi(x, y) = n$ has only a finite number of
integer solutions $(x, y)$, $x > 0$, $y > 0$.

POSTULATE II. For all integers $x, y > 0$, $\varphi(x, y) = \varphi(y, x)$.

POSTULATE III. For all integers $x, y, z > 0$, $\varphi(x, \varphi(y, z)) = \varphi(\varphi(x, y), z)$.

POSTULATE IV. If $n$, $x$ are integers $> 0$, $\varphi(x, 1) = n$ implies $x = n$.

POSTULATE V. If $n$ is an integer $> 0$, and $d(n)$ denotes the greatest value $> 0$ of
$x$ (or of $y$) for which $\varphi(x, y) = n$, then, for each integer $m > 0$, the
equation $d(n) = m$ has a unique solution $n$, and $d(1) = 1$.

Having obtained the general polynomial solution of the functional equation
in Postulate III in which, first, $x$, $y$, $z$ are complex numbers, we shall then
show that when $x$, $y$, $z$ are restricted to be integers $> 0$, Postulate II is
superfluous (for polynomial solutions) when Postulate I is assumed, and that
Postulate IV then suffices to isolate either Cauchy or Dirichlet multiplication
as the composition $\varphi(x, y)$, so that, in the case of $\varphi(x, y)$ a
polynomial, Postulate V is superfluous. We shall prove first


THEOREM 1. The only polynomial solutions of

$$\varphi(x, \varphi(y, z)) = \varphi(\varphi(x, y), z) \tag{1}$$

in the domain of complex numbers* are the unsymmetric solutions

$$\varphi(x, y) = x, \qquad \varphi(x, y) = y, \tag{2}$$

and the symmetric solution

$$\varphi(x, y) = a + b(x + y) + cxy, \tag{3}$$

in which $a$, $b$, $c$ are any constants such that

$$b^2 - b - ac = 0. \tag{4}$$

Note that Postulate I, which is necessary for the arithmetical theory, excludes
the solutions (2), also the solution $\varphi(x, y) = a$, included in (3), which
appears in (7) below as one of two possibilities consequent on the assumption
that $\varphi(x, y)$ is a polynomial.


* This can be extended to any domain of integrity. For the interpretation of the solutions (2), 
see Lehmer, loc. cit., p. 946. 
Let $\varphi(x, y)$ be a polynomial of degree $r \ge 0$ in the complex variables
$x$, $y$ with constant term $c$. Then we may take

$$\varphi(0, y) = c + b_1 y + \cdots + b_r y^r. \tag{5}$$

In (1) take $x = y = 0$. Then

$$\varphi(0, \varphi(0, z)) = \varphi(c, z), \tag{6}$$

identically in $z$. By (5) the right of (6) is of degree $r$ in $z$; the left is of
degree $r^2$, and the coefficient of $z^{r^2}$ is $b_r^{r+1}$. Hence, unless
$r^2 \le r$, it follows that $b_r = 0$. But $r^2 < r$ is impossible, $r$ being an
integer $\ge 0$. Thus $r = 0$ or $1$. Hence either

$$\varphi(x, y) = c, \tag{7}$$

where $c$ is an arbitrary constant, or

$$\varphi(x, y) = a + bx + ky + pxy, \tag{8}$$

where the constants $a$, $b$, $k$, $p$ are to be conditioned so that this
$\varphi(x, y)$ satisfies (1). A short reduction of the result of substituting
$\varphi(x, y)$ as in (8) into (1) gives (4) as a necessary and sufficient condition
that (8) be a solution of (1).
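For the symmetric form (3), the "short reduction" amounts to the identity $\varphi(x, \varphi(y, z)) - \varphi(\varphi(x, y), z) = (b + ac - b^2)(x - z)$, so associativity holds exactly when (4) does. The sketch below is a brute-force confirmation on a small grid, offered as an illustration rather than a proof; the names `phi`, `is_associative`, and `condition4` are hypothetical, and rational test points are an arbitrary choice.

```python
from fractions import Fraction
from itertools import product


def phi(a, b, c):
    """The symmetric candidate (3): phi(x, y) = a + b(x + y) + cxy."""
    return lambda x, y: a + b * (x + y) + c * x * y


def is_associative(f, points):
    """Check phi(x, phi(y, z)) == phi(phi(x, y), z) on every triple of points."""
    return all(f(x, f(y, z)) == f(f(x, y), z)
               for x, y, z in product(points, repeat=3))


def condition4(a, b, c):
    """Condition (4): b^2 - b - ac = 0."""
    return b * b - b - a * c == 0


points = [Fraction(n, 2) for n in range(-4, 5)]
coefficients = [Fraction(n) for n in range(-2, 3)]

# Coefficient triples on which associativity and condition (4) disagree.
mismatches = [(a, b, c)
              for a, b, c in product(coefficients, repeat=3)
              if is_associative(phi(a, b, c), points) != condition4(a, b, c)]
print(mismatches)

cauchy = phi(Fraction(-1), Fraction(1), Fraction(0))     # x + y - 1
dirichlet = phi(Fraction(0), Fraction(0), Fraction(1))   # xy
print(is_associative(cauchy, points), is_associative(dirichlet, points))
```

Because the difference of the two sides is a multiple of $x - z$, any grid containing two distinct points already separates the associative cases from the rest.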


COROLLARY. If $\varphi(x, y)$ is a polynomial in $x$, $y$, Postulates I, III imply
Postulate II.

3. Exclusion of all but C, D multiplication. Everything will be proved 
as stated for polynomial composition ¢(x, y) satisfying Postulates I-V when 
we prove 


THEOREM 2. The only $\varphi(x, y)$ as in (3), (4) satisfying Postulate IV are

$$\varphi(x, y) = x + y - 1, \qquad \varphi(x, y) = xy. \tag{9}$$

For Postulate I rejected the solutions (2); the Corollary in §2 rendered
Postulate II superfluous when $\varphi(x, y)$ is a polynomial; and if either of (9)
holds, Postulate V is automatically satisfied.

To prove Theorem 2, we note that if, in accordance with Postulate IV,
$\varphi(x, y)$ as in (3) satisfies the condition that $\varphi(x, 1) = n$ implies
$x = n$ for all integers $n > 0$, then

$$a + b = n(1 - b - c),$$

for all integers $n > 0$, and hence $a + b = 0$, $b + c = 1$. (Otherwise: take
$n = 1, 2, 3$, and get the same conditions.) Thus $a = -b$, $c = 1 - b$, and these
values of $a$, $c$ satisfy (4). Hence

$$\varphi(x, y) = -b + b(x + y) + (1 - b)xy, \tag{10}$$

in which the constant $b$ is to be determined. By the definition of composition
and the postulates, $\varphi(x, y)$ is to be an integer $> 0$ for all integers
$x, y > 0$. The condition is satisfied if $b = 0$ or $1$, giving

$$\varphi(x, y) = xy, \qquad \varphi(x, y) = x + y - 1. \tag{11}$$

To see that no solutions other than (11) are possible, let $b = 1 + h$, $h > 0$.
Then, from (10),

$$\varphi(x, y) = (1 + h)(x + y - 1) - hxy,$$

which is negative ($h \ge 1$) if $xy > 2(x + y - 1)$. This inequality is obviously
satisfied for an infinity of pairs of integers $x, y > 0$.
A LOCAL SOLUTION OF THE DIFFERENCE
EQUATION Δy(x)=F(x) AND OF RELATED
EQUATIONS*


BY 
I. M. SHEFFER 


Introduction. We propose to consider certain aspects of the equation

$$\Delta y(x) \equiv y(x + 1) - y(x) = F(x), \tag{1}$$

and of other equations to be mentioned. There have been three principal lines
of study of (1): (a) that relating to special classes of functions $F(x)$; (b) that
based on the character of $F(x)$ at infinity; and (c) that making a local study
of the equation.

As an illustration of (a) is the theorem (established, by different methods,
by Guichard,† Appell, Hurwitz, Carmichael) that if $F(x)$ is an entire function,
then an entire function solution $y(x)$ exists. Again, if $F(x)$ is meromorphic,
then a solution $y(x)$ exists which is also meromorphic (Hurwitz, loc. cit.).

Concerning (b): If the process of iteration be applied to (1), there arise
two well known formal solutions‡

$$y(x) \sim \sum_{n=1}^{\infty} F(x - n), \tag{2}$$

$$y(x) \sim -\sum_{n=0}^{\infty} F(x + n). \tag{3}$$

Unfortunately these series are too often divergent. But by the introduction
of suitable exponential convergence factors, Nörlund has shown§ that the resulting
series will converge for a large class of functions $F(x)$.

A local study of equation (1) was made by Guichard (loc. cit.) who set
up a solution in the form of a definite integral (cf. Nörlund, loc. cit., p. 38).
This integral, however, has the unfortunate feature of representing (in general)
an infinitely multiple-valued function. Some extensions of the local
theory have been made by Carmichael (loc. cit.), using the Guichard integral.

* Presented to the Society, September 6, 1934; received by the editors May 24, 1935, and, in
revised form, October 25, 1935.

† References are to be found in Nörlund, Differenzenrechnung, 1924, especially pp. 38-39 and
Bibliography; and in Carmichael, American Journal of Mathematics, vol. 35 (1913), pp. 163-182.

‡ Cf., for example, Batchelder, Linear Difference Equations, Harvard University Press, 1927, p. 6.

§ Loc. cit., pp. 40-43 for a real variable, and pp. 68-71 for a complex variable.

Part I of the present paper is largely devoted to the local theory of equation
(1). We obtain a solution in various forms, including a definite integral
and a series of polynomials. In §1 we give a general formal solution which
includes (2) and (3), and state a sufficient condition under which convergence
takes place in an infinite strip. The local study begins in §2 where by use of
the Pincherle integral* we show that if $F(x)$ is analytic in $|x| < r$ with
$r > 1/2$, then an analytic solution $y(x)$ exists in the neighborhood of $x = 0$.

In §3 we consider the case where $F(x)$ is rational. Such functions have been
treated before, but the particular form that we obtain for the solution (as a
definite integral) is needed in the next section, §4, where we introduce the
polynomials $\{(x + 1)^n - x^n\}$. Solutions of equation (1) are obtained as series
in these polynomials, by various methods. Ultimately we are able to establish
that every function $F(x)$ which is analytic about $x = -1/2$ has a convergent
expansion (not however unique) in these polynomials; and corresponding to
such a function $F(x)$, a solution is found.

Part II carries the methods of Part I over to the more general equation

$$L[y(x)] \equiv a_1 y(x + \omega_1) + \cdots + a_k y(x + \omega_k) = F(x). \tag{4}$$

In recent years this equation has been the subject of several investigations
appearing in Acta Mathematica.† The point of view is that of Nörlund's principal
solution, depending on the character of $F(x)$ at infinity, and using sum
formulas. Our work consists in a local study of the equation, thus leading to
results essentially different from those found in the papers mentioned.


Part I 


1. A general formal solution. We seek a solution of equation

$$y(x + 1) - y(x) = F(x) \tag{1.1}$$

in the form*

$$y(x) = \sum_{n=-\infty}^{\infty} L_n(x) F(x + n). \tag{1.2}$$

* Our form of this integral is that used by Borel in his method of analytic continuation.

† S. Bochner, vol. 51 (1928), pp. 1-21: Hauptlösungen von Differenzengleichungen.

R. Raclis, vol. 55 (1930), pp. 277-394: Solution principale de l'équation linéaire aux différences
finies.

M. Ghermanesco, vol. 62 (1934), pp. 239-287: Sur les équations aux différences finies.

For a real-variable treatment, consult S. Bochner, Vorlesungen über Fouriersche Integrale,
Leipzig, 1932, chapters 5, 6.

Closer in spirit to our own work are the following important contributions:

Pincherle, Sur la résolution ..., Acta Mathematica, vol. 48 (1926), pp. 279-304. (This is a
translation of an Italian memoir of 1888.)

Carmichael, Systems of linear difference equations ..., these Transactions, vol. 35 (1933), pp.
1-28, and Summation of functions ..., Annals of Mathematics, vol. 34 (1933), pp. 349-378.


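On substituting in (1.1) we find that

$$L_{n-1}(x + 1) - L_n(x) = 0 \quad (n \ne 0), \qquad L_{-1}(x + 1) - L_0(x) = 1. \tag{1.3}$$

Let $L_0(x)$ be arbitrary. Then $L_{-n}(x) = 1 + L_0(x - n)$, $L_n(x) = L_0(x + n)$,
$n = 1, 2, \dots$, so that

THEOREM 1.1. A formal solution of (1.1) is given by

$$y(x) \sim \sum_{n=0}^{\infty} L(x + n) F(x + n) + \sum_{n=1}^{\infty} [1 + L(x - n)] F(x - n), \tag{1.4}$$

where $L(x)$ is an arbitrary function.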


Series (2), (3) of the Introduction are particular cases of (1.4), namely
when $L(x) = 0$, $-1$ respectively. The arbitrariness of $L(x)$ allows considerable
freedom, and suggests that for many classes of functions $F(x)$, a choice of
$L(x)$ may be made to yield convergence for (1.4), in which case (1.4) will define
a solution of (1.1). In considering convergence, it is desirable that
$L(x + n)$ and $1 + L(x - n)$ approach zero rapidly as $n \to \infty$ ($x$ remaining in
some bounded region). We now examine a simple choice of $L(x)$.

Let†

$$L(x) = -e^{-e^{x}}, \tag{1.5}$$

and consider an infinite strip $S$ parallel to the real axis:

$$S: \quad x = u + iv, \quad u \text{ arbitrary}, \quad m \le v \le M,$$

where $m$, $M$ are such that $\cos v \ne 0$ in $m \le v \le M$. Let $R$ be any bounded
region in $S$, and let $x$ be in $R$. Since
$L(x + n) = -e^{-e^{u+n}\cos v} \cdot e^{-i e^{u+n}\sin v}$, we have uniformly in $R$

$$|L(x + n)| \le e^{-c\, e^{u+n}} \qquad (0 < c \le \cos v, \ m \le v \le M). \tag{1.6}$$

On the other hand, $1 + L(x - n) = [e^{e^{x-n}} - 1] \div e^{e^{x-n}}$. The denominator
approaches 1 uniformly ($x$ in $R$) as $n \to \infty$, and may be ignored. It can then
be shown* that there exists a constant $B$ (depending only on $R$) such that
(uniformly in $R$)

$$|1 + L(x - n)| \le B e^{-n}. \tag{1.7}$$

* Nörlund (loc. cit., pp. 279-280) uses this device for a homogeneous equation of nth order.
Consult also, in connection with the results of this section, E. Lindelöf, Le Calcul des Résidus, Paris,
1905; especially pp. 52-62.

† One can equally well replace e by a, a>1.

Relations (1.6) and (1.7) serve to establish 


THEOREM 1.2. Let $F(x)$ be analytic except for isolated† singularities in a
strip $S$: $u$ arbitrary, $m \le v \le M$, where $c$ exists such that
$0 < c \le \cos v$ in $m \le v \le M$. Let $S'$ denote a region obtained from $S$ by
surrounding each singularity of $F(x)$ by a circle and removing the interior of the
circle, and let there exist four positive numbers $r < e$, $\sigma > 1$, $C$, $D$
(depending on $S'$) such that for every $S'$, and for all $x$ in $S'$, the following
relations hold:

(1.8) | F(x) | o>1,u>Q0; 
(1.9) | F(x) | < r<e, uso. 


Then series (1.4), with $L(x)$ given by (1.5), converges for all $x$ in $S$ save
(perhaps) for the singularities of $F(x)$ and all points conjugate‡ (both right and
left) to them. Moreover, (1.4) is a solution of (1.1) in $S$ (save at the points
already excepted).

Theorem 1.2 is a sample of a type of theorem obtainable by a judicious
choice of $L(x)$. Better theorems are surely possible, but it remains an open
question if the ultimate theorem of this character can be obtained. It would
state that whenever the rate of growth of $F(x)$ is known (in an $S$ strip) as
$u \to \pm\infty$, then a corresponding $L(x)$ may be found such that (1.4) converges
for all $x$ in $S$ save for certain points (and their conjugates) that are
singularities of $F(x)$.
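A numerical sketch may make Theorem 1.1 concrete. It is an illustration under assumptions that are not taken from the text: $x$ is kept real (so $v = 0$), $L$ is taken as in (1.5), the right-hand side is the bounded sample function $F(x) = 1/(1 + x^2)$, and the series (1.4) is truncated after a fixed number of terms. The names `L`, `one_plus_L`, `y`, and `residual` are hypothetical.

```python
import math


def L(x):
    """The choice (1.5): L(x) = -exp(-exp(x))."""
    return -math.exp(-math.exp(x))


def one_plus_L(x):
    """1 + L(x), computed with expm1 to avoid cancellation when exp(x) is small."""
    return -math.expm1(-math.exp(x))


def F(x):
    """Assumed sample right-hand side; any bounded function would serve here."""
    return 1.0 / (1.0 + x * x)


def y(x, terms=60):
    """Truncation of the formal series (1.4)."""
    right = sum(L(x + n) * F(x + n) for n in range(0, terms + 1))
    left = sum(one_plus_L(x - n) * F(x - n) for n in range(1, terms + 1))
    return right + left


def residual(x):
    """y(x + 1) - y(x) - F(x), which should vanish up to truncation and rounding."""
    return y(x + 1.0) - y(x) - F(x)


check_points = [-2.5, -1.0, 0.0, 0.3, 1.7, 3.0]
residuals = [residual(x) for x in check_points]
print(residuals)
```

The truncated sums telescope, leaving only the two boundary terms $L(x + N + 1)F(x + N + 1)$ and $[1 + L(x - N)]F(x - N)$ as error, and both are negligible for the truncation length used here.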

2. A local solution as a definite integral. In this section we shall establish,
by use of the Pincherle (or Borel) integral and its inverse, that if $F(x)$
is analytic in $|x| < r$ where $r > 1/2$, then a local solution exists.

* By the Law of the Mean applied to in OS we get 0<t,<e™. Hence
—1 = (1 + — 1.
Since ef" <1, we may expand by the binomial theorem, getting
c(C + 1)
2!
— Ae}-Crn,
From this (1.7) follows.
C = max| e“*iv?|in R, A = max e™);
† The point at infinity may be a cluster point of singularities.
‡ The points right (left) conjugate to x are the points x+1, x+2, ··· (x−1, x−2, ···).

Let $f(x) = \sum_0^\infty f_n x^n / n!$ be an entire function of exponential type
$\rho$ (exp. type $\rho$); i.e., $\limsup |f_n|^{1/n} = \rho$, which we assume
throughout to be finite. Then for every $x$, as is well known,*

$$f(x) = \frac{1}{2\pi i} \int_C \frac{F(1/t)}{t}\, e^{xt}\, dt, \tag{2.1}$$

where† $F(x) = \sum_0^\infty f_n x^n$, and $C$ is a contour about $t = 0$ whose minimum
distance from $t = 0$ exceeds $\rho$.

Now the equation $\Delta y(x) = e^{tx}$ has an obvious solution
$y(x) = e^{tx} \div (e^t - 1)$, $t \ne \pm 2k\pi i$. Applying this to (2.1) we get‡

LEMMA 2.1. If $f(x)$ is of exp. type $\rho$, then a solution of

$$\Delta y(x) = f(x) \tag{2.2}$$

is given by

$$y(x) = \frac{1}{2\pi i} \int_C \frac{F(1/t)}{t} \cdot \frac{e^{xt}}{e^t - 1}\, dt, \tag{2.3}$$

where $C$ surrounds $t = 0$, does not pass through any zero of $(e^t - 1)$, and lies at
a distance exceeding $\rho$ from $t = 0$.


As simple deductions from Lemma 2.1 we have

COROLLARY 2.1. If the maximum distance from $C$ to $t = 0$ is $\sigma$, then $y(x)$ of
(2.3) is of exp. type not exceeding $\sigma$.

COROLLARY 2.2. If $f(x)$ is of exp. type $\rho$, there is no solution $y(x)$ of exp.
type less than $\rho$, but there is a solution of exp. type $\rho$. Such a solution is
given by (2.3) where $C$ is the circle $|t| = \sigma$, with
$2k\pi \le \rho < \sigma < 2(k + 1)\pi$.


We have found a solution of the difference equation (2.2) for functions 
f(x) that are of finite exponential type. This leaves out of consideration all 
entire functions of infinite exponential type, and all analytic functions that 
are not entire. We shall examine their case. 

* Pincherle, loc. cit., p. 285.

† F(x) is analytic at least in |x| < 1/ρ. We shall say that f(x) is the Pincherle entire function
associated with F(x).

‡ This result is to be found in Pincherle, loc. cit. See also Carmichael, Annals of Mathematics,
loc. cit., pp. 361-367.

Let $F(x) = \sum_0^\infty f_n x^n$ be analytic about $x = 0$, with radius of convergence
$r$, and let $f(x) = \sum_0^\infty f_n x^n / n!$ be the corresponding Pincherle entire
function. (It is of exp. type $\rho = 1/r$.) The two functions are not only related by
(2.1), but also by the Borel (or Pincherle) integral

$$F(x) = \int_0^\infty e^{-t} f(tx)\, dt, \tag{2.4}$$

valid for every $x$ in the Borel polygon for $F(x)$.
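The Borel integral (2.4) is easy to test numerically on a simple pair. The sketch below assumes the example $F(x) = 1/(1 - x)$, whose Pincherle entire function is $f(x) = e^x$ and whose Borel polygon is the half-plane $\operatorname{Re} x < 1$; the truncation of the integral at a finite upper limit and the composite Simpson rule are choices made only for this illustration, and the names `simpson` and `borel` are hypothetical.

```python
import math


def simpson(g, a, b, n):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3.0


def f(x):
    """Pincherle entire function of F(x) = 1/(1 - x): every f_n equals 1."""
    return math.exp(x)


def F(x):
    return 1.0 / (1.0 - x)


def borel(x, upper=60.0, n=12000):
    """Truncated Borel integral of (2.4) for real x."""
    return simpson(lambda t: math.exp(-t) * f(t * x), 0.0, upper, n)


test_points = [-2.0, -0.5, 0.3, 0.5]
errors = [abs(borel(x) - F(x)) for x in test_points]
print(errors)
```

The point $x = -2$ lies outside the circle of convergence $|x| < 1$ of the power series yet inside the Borel polygon, so the integral still reproduces $F$ there.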

Now consider the equation $\Delta y(x) = F(x)$. The trend of our argument is
as follows: The function $f(tx)$ is (in $x$) of exp. type $|t|\rho$, so to it the
preceding corollaries apply. We can then find a solution $y(x; t)$ of
$\Delta y(x; t) = f(tx)$. As $t$ varies we shall need to revise the contour $C$ of
(2.3). Passing over this (for the moment), we have, formally,

$$F(x) = \int_0^\infty e^{-t}\, \Delta y(x; t)\, dt = \Delta\left[\int_0^\infty e^{-t} y(x; t)\, dt\right],$$

so that a suggested solution of $\Delta y = F(x)$ is
$y(x) = \int_0^\infty e^{-t} y(x; t)\, dt$.

We can now validate the formal work. Divide the positive $t$-axis:
$0 \le t < \infty$ into the intervals

$$I_n: \quad 2(n - 1)\pi \le t < 2n\pi \qquad (n = 1, 2, 3, \dots).$$

For $t$ on $I_n$ set

$$y_n(x; t) = \frac{1}{2\pi i} \int_{C_n} \frac{F(t/u)}{u} \cdot \frac{e^{xu}}{e^u - 1}\, du, \tag{2.5}$$

where $C_n$ is the circle $|u| = 2n\pi r_n$, the numbers $r_n$ being chosen to fulfill
the following two compatible conditions:

(i) $\sigma \ge r_n \ge \rho$, $n = 1, 2, \dots$, where $\sigma$, $\rho$ are numbers
satisfying $\sigma > \rho > 1/r$.

(ii) There is a number $\delta > 0$ such that no pair of numbers, one from each
of the two sets $\{2m\pi\}$, $m = 1, 2, \dots$, $\{2n\pi r_n\}$, $n = 1, 2, \dots$,
are at a distance apart less than $\delta$.

For $t$ on $I_n$ and $u$ on $C_n$, we have $|t/u| < 1/r_n < r$, so that (by Lemma 2.1)
(2.5) is a solution of $\Delta y = f(tx)$. Again, this inequality tells us that
$|F(t/u)|$ is bounded, uniformly in $n$, for $t$ on $I_n$ and $u$ on $C_n$. This is
also true* of the function $|e^u - 1|^{-1}$. Consequently $y_n(x; t)$ is of exp. type

$$\le |u| = 2n\pi r_n \le \left(\frac{n}{n - 1}\right)\sigma t,$$

$t$ on $I_n$; and $|y_n(x; t)| < N e^{\tau_n(x)}$, where $N$ is independent of $n$, $x$,
$t$, and $\tau_n(x) = \max\{\text{real part of } (ux)\}$ for $u$ on $C_n$.

We now define the function $y(x; t)$ from $t = 0$ to $t = \infty$:

$$y(x; t) = y_n(x; t), \quad t \text{ on } I_n. \tag{2.6}$$

* This follows by a straightforward analysis.
THEOREM 2.1. Let $F(x)$ be analytic, $|x| < r$, and let $\tau$ be any number in
$0 < \tau < r$. Choose $\sigma$, $\rho$ so that $1/\tau > \sigma > \rho > 1/r$, and
determine the circles $C_n$ and the functions $y_n(x; t)$ according to this $\sigma$,
$\rho$. Then the function

$$y(x) = \int_0^\infty e^{-t} y(x; t)\, dt \tag{2.7}$$

is analytic* in $|x| < \tau$.

For if $x$ is in $|x| < \tau$, then

$$\tau_n(x) \le 2n\pi r_n |x| < \left(\frac{n}{n - 1}\right)\sigma\tau t < \left(\frac{n}{n - 1}\right)\gamma t,$$

where $\gamma$ is chosen so that $\sigma\tau < \gamma < 1$. Therefore (2.7) converges
absolutely and uniformly in $|x| < \tau$. We can write the integral as an infinite
series:

$$y(x) = \sum_{n=1}^{\infty} \int_{2(n-1)\pi}^{2n\pi} e^{-t} y_n(x; t)\, dt, \tag{2.8}$$

which likewise converges absolutely and uniformly, $|x| < \tau$. Moreover, each
term is an analytic function in $|x| < \tau$, thus making $y(x)$ analytic in
$|x| < \tau$.

Now suppose $F(x) = \sum f_n x^n$ has a radius of convergence $r > 1/2$. Choose $\tau$
as any number in $1/2 < \tau < r$. There exist certain points $x$ in $|x| < \tau$ for
which, also, $|x + 1| < \tau$. Let $R$ be an open connected set (e.g., a small circle)
of such points. For $x$ in $R$, we have from (2.8)

$$\Delta y(x) = \sum_{n=1}^{\infty} \int_{2(n-1)\pi}^{2n\pi} e^{-t}\, \Delta y_n(x; t)\, dt = \sum_{n=1}^{\infty} \int_{2(n-1)\pi}^{2n\pi} e^{-t} f(tx)\, dt = \int_0^\infty e^{-t} f(tx)\, dt,$$

since $\Delta y_n = f(tx)$ on $I_n$. As the last integral is precisely $F(x)$, we see
that $y(x)$ satisfies

$$\Delta y(x) = F(x), \tag{2.9}$$

at least for $x$ in $R$. Now $y(x)$ is analytic in $|x| < \tau$, so that
$G(x) = y(x) + F(x)$ is also analytic there. Again, $G(x) = y(x + 1)$ in $R$, whence,
by analytic continuation, $y(x + 1)$ is analytic in $|x| < \tau$; and this makes $y(x)$
a solution for all $x$ in $|x| < \tau$. We thus have

THEOREM 2.2. Let $F(x)$ be analytic in $|x| < r$, $r > 1/2$. To every $\tau < r$ there
corresponds a function $y(x)$, given by (2.8), which is analytic together with
$y(x + 1)$ in $|x| < \tau$, and which satisfies (2.9) in $|x| < \tau$.


The condition $\tau < r$ prevents us from asserting analyticity of $y(x)$ in the
larger region $|x| < r$. Inspection of the proof of Theorem 2.1 shows that the
only purpose served by this condition is to guarantee that $|F(t/u)|$ is uniformly
bounded, $t$ on $I_n$ and $u$ on $C_n$. There are cases when this is true even
if $\tau = r$, in which case Theorem 2.2 will continue to hold:

COROLLARY 2.3. If $F(x)$ is either analytic in $|x| \le r$ or is analytic and
uniformly bounded in $|x| < r$, then the contours $C_n$ of (2.12) can be so chosen*
that $y(x)$ as given by (2.8) is analytic in $|x| < r$, and is a solution of (2.9) for
all $x$ in $|x| < r$. (Here $r > 1/2$.)

* $|x| < \tau$ can be replaced by $|x| \le \tau$ by choosing $\tau'$ slightly larger than $\tau$.

It is even possible, in certain cases, to remove the restriction that
$|F(t/u)|$ be uniformly bounded:

COROLLARY 2.4. Let $F(x)$ be analytic in $|x| < r$, where $r > 1/2$. For each
$\rho < r$, define $M(\rho)$ by

$$M(\rho) = \max_{|x| = \rho} |F(x)|. \tag{2.10}$$

If there exists an increasing sequence of positive numbers $\{\rho_n\}$ with
$\rho_n < r$ and $\lim \rho_n = r$, such that to every $\epsilon > 0$ there is a constant
$A = A_\epsilon$ for which


(2.11) M (pn) 


then $y(x)$ as given by† (2.8) is analytic in $|x|<r$, and is a solution of (2.9)
throughout $|x|<r$.


The proof is quite straightforward, and may be omitted. 
From the preceding theorem and corollaries follows 


THEOREM 2.3. Let $p(x)$ be a function for which a determination of $\log p(x)$
exists satisfying the hypotheses made on $F(x)$ in Theorem 2.2 or in Corollaries
2.3 or 2.4. Then the equation

(2.12)  $y(x+1)-p(x)\,y(x)=0$

has a solution analytic in $|x|<\tau$ or $|x|<r$ according to the case.
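
The reduction behind Theorem 2.3 can be sketched as follows. This is only an illustrative derivation, assuming that the preceding results are applied with $F(x)=\log p(x)$; the auxiliary function $u(x)$ is a name introduced here and does not appear in the original.

```latex
\begin{align*}
\Delta u(x) &= u(x+1)-u(x) = \log p(x)
  && \text{(solvable by Theorem 2.2 or its corollaries)},\\
y(x) &= e^{u(x)}
  \;\Longrightarrow\;
  y(x+1) = e^{u(x)+\log p(x)} = p(x)\,y(x).
\end{align*}
```

Since $u(x)$ and $u(x+1)$ are analytic in the circle in question, so are $y(x)$ and $y(x+1)$.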


In the theorems of this section it has been necessary to have $r>\tfrac12$ in order
that the difference operator be applicable to (2.8). This condition appears to
be essential for the method of this section, but it is not inherently necessary
for the problem, as will be shown in §4.

* We have only to replace condition (i) after (2.5) by (i') $r_n>1/r$, with $r_n\to 1/r$ as $n\to\infty$.

† With the same modification as in Corollary 2.3, choosing $r_n=1/\rho_n$. There is, however, a point
that needs clarification: Conditions (i), (ii) after (2.5) were compatible heretofore because the
numbers $r_n$ were capable of variation (within limits). In the present case the sequence $r_n$ appears to be
fixed, and conceivably condition (ii) is no longer true. However, we can always find a new sequence
$\rho_n'$ such that $\rho_n'<\rho_n$, $\rho_n'\to r$, and such that (ii) is satisfied. Since $\rho_n'<\rho_n$, (2.11) also holds for $\rho_n'$,
and we may set $r_n=1/\rho_n'$.

3. F(x) rational. It is well known that if $F(x)$ is a rational function, the
equation $\Delta y=F(x)$ has a solution that can be expressed in terms of the
classical* $\Psi$-function and its derivatives. We shall then be brief in showing how a
solution can be expressed as a definite integral. In §4 we have need of the
particular case where $F(x)=(x-\alpha)^{-1}$.

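Let $F(x)=1/(x-\alpha)=-(1/\alpha)\sum_{n=0}^{\infty}(x/\alpha)^n$. The corresponding
Pincherle entire function is $f(x)=(-1/\alpha)e^{x/\alpha}$; and the work of the preceding
section suggests that the function†

(3.1)  $y(x;\alpha)=-\beta\displaystyle\int_0^{\infty}e^{-t}\,\frac{e^{t\beta x}-1}{e^{t\beta}-1}\,dt$

is a solution of‡

(3.2)  $\Delta y(x)=\dfrac{1}{x-\alpha}.$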


Set§ $\beta=1/\alpha=a+ib$, $a\neq 0$, and let $p=0$ for $a<0$, and $p=a$ for $a>0$. It
is readily established that (3.1) converges absolutely for all $x=u+iv$ in the
half-plane (i) $au-bv-(1+p)<0$, and converges uniformly‖ in any closed region
therein. Also, it diverges for every $x$ for which the left member of (i) is
positive.

Let $l$ be the line through $\alpha$ perpendicular to the segment $O\alpha$, and let $l^{(1)}$
be the line obtained by translating $l$ a distance one to the right. Then the
half-plane of convergence (i) is determined by $l$ or $l^{(1)}$ according as $a<0$ or
$a>0$. (In either case the origin lies in the half-plane.) On applying the
difference operator to (3.1) we find that for every $x$ in the half-plane $au-bv-1<0$,
(3.1) is a solution of (3.2).
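
A quick numerical sanity check of this claim can be sketched as follows. It is only an illustration under stated assumptions: (3.1) is taken in the form $y(x;\alpha)=-\beta\int_0^\infty e^{-t}(e^{t\beta x}-1)/(e^{t\beta}-1)\,dt$ with $\beta=1/\alpha$, the infinite integral is truncated, and the helper name `y_integral`, the cutoff and the step count are choices made here, not part of the original.

```python
import cmath

def y_integral(x, alpha, upper=60.0, steps=60000):
    """Composite Simpson approximation of the integral (3.1), truncated at `upper`."""
    beta = 1 / alpha

    def integrand(t):
        if t == 0.0:
            return complex(x)  # limit of (e^{t*beta*x}-1)/(e^{t*beta}-1) as t -> 0
        num = cmath.exp(t * beta * x) - 1
        den = cmath.exp(t * beta) - 1
        return cmath.exp(-t) * num / den

    h = upper / steps  # steps must be even for Simpson's rule
    total = integrand(0.0) + integrand(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return -beta * total * h / 3

alpha = complex(-2.0, 0.0)   # Re(1/alpha) < 0, the case a < 0
x = complex(0.3, 0.1)        # lies in the half-plane Re(x/alpha) < 1
lhs = y_integral(x + 1, alpha) - y_integral(x, alpha)   # Delta y(x)
rhs = 1 / (x - alpha)
print(abs(lhs - rhs))        # should be close to zero
```

The same function can be evaluated for other values of `alpha` with negative real part, as long as `x` and `x + 1` stay in the half-plane of convergence.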

Relations (3.1), (3.2) permit us to continue the function $y(x;\alpha)$
analytically, giving

* $\Psi(x)\equiv\Gamma'(x)/\Gamma(x)$; cf. Batchelder, loc. cit., pp. 54-56.

† We use $(e^{t\beta x}-1)$ in the numerator rather than $e^{t\beta x}$ in order to avoid convergence difficulties
at $t=0$.

‡ For classical treatments of (3.2) (with $\alpha=0$, which makes no essential difference), see Nörlund,
loc. cit., pp. 99-103; Carmichael, Annals of Mathematics, loc. cit., p. 359; and Milne-Thomson,
The Calculus of Finite Differences, London, Macmillan, 1933, pp. 247-248. It is of interest to compare
their form of solution with (3.1) and (3.3), (3.4).

§ The condition $a\neq 0$ can be removed by a translation of the independent variable.

‖ There is uniform convergence in the half-plane $au-bv-(1+p)\le-\delta$ for every $\delta>0$.


THEOREM 3.1. For every positive integer $n$, the function $y(x;\alpha)$ of (3.1) has
the representation

(3.3)  $y(x;\alpha)=-\displaystyle\sum_{k=0}^{n-1}\frac{1}{x+k-\alpha}-\beta\int_0^{\infty}e^{-t}\,\frac{e^{t(x+n)\beta}-1}{e^{t\beta}-1}\,dt \qquad (a<0),$

(3.4)  $y(x;\alpha)=\displaystyle\sum_{k=1}^{n}\frac{1}{x-k-\alpha}-\beta\int_0^{\infty}e^{-t}\,\frac{e^{t(x-n)\beta}-1}{e^{t\beta}-1}\,dt \qquad (a>0).$

The domain of convergence is a half-plane containing the origin and determined
by the line ${}^{(n)}l$ (by the line $l^{(n+1)}$), where ${}^{(n)}l$ ($l^{(n+1)}$) is obtained from $l$ by a
translation $n$ units to the left ($n+1$ units to the right).


COROLLARY 3.1. If $a<0$ (if $a>0$) the only singularities of $y(x;\alpha)$ are simple
poles at the points $x=\alpha,\ \alpha-1,\ \alpha-2,\ \dots$ (at the points $x=\alpha+1,\ \alpha+2,\ \dots$).


Having (3.3) or (3.4) it might be thought that by using Cauchy's formula
we can now obtain a solution of $\Delta y(x)=F(x)$ for any analytic function. But
this seemingly hopeful line proves illusory,* and we are forced to modify this
line of procedure.

If $F(x)$ is a rational function, a solution of the equation $\Delta y=F(x)$ can
clearly be expressed in terms of the function $y(x;\alpha)$ and its derivatives,
where† $\alpha$ takes on values corresponding to the poles of $F(x)$. Consequently
the function $y(x;\alpha)$ serves the same purpose as does the $\Psi$-function.


* (3.2) suggests that

$$\Delta\Big[-\frac{1}{2\pi i}\int_C y(x;\alpha)F(\alpha)\,d\alpha\Big]=\frac{1}{2\pi i}\int_C\frac{F(\alpha)\,d\alpha}{\alpha-x},$$

so that the bracket will be an analytic solution of $\Delta y=F(x)$. But it is not. For let $C$ be a circle lying
(say) to the left of the imaginary axis. Let $F(x)$ be analytic in and on $C$. The integer $n$ can be chosen
so large that the integral in (3.3) converges uniformly in $x$ and $\alpha$, for both $x$ and $\alpha$ in and on $C$.
This integral is therefore analytic in $\alpha$ in and on $C$, and on multiplying by $F(\alpha)$ and integrating
over $C$, the contribution is zero. Hence for $x$ interior to $C$,

$$Y(x)\equiv-\frac{1}{2\pi i}\int_C y(x;\alpha)F(\alpha)\,d\alpha=\frac{1}{2\pi i}\sum_{k=0}^{n-1}\int_C\frac{F(\alpha)\,d\alpha}{x+k-\alpha}.$$

Now suppose for definiteness that $C$ is of radius less than $\tfrac12$. Then of the points $\alpha=x,\ x+1,\ x+2,\ \dots$,
only $x$ is in $C$, so that $Y(x)=-F(x)$. Similarly,

$$-\frac{1}{2\pi i}\int_C y(x+1;\alpha)F(\alpha)\,d\alpha=0,$$

since none of the points $x+1,\ x+2,\ \dots$ is in $C$. Hence while it is true that $Y(x+1)-Y(x)=F(x)$,
yet $Y(x+1)$ is not the value of $Y(x)$ when $x$ is replaced by $x+1$. That is, we do not have an analytic
solution.

† As stated before, the condition $a\neq 0$ (where $\beta=1/\alpha=a+ib$) can be dispensed with. Further,
if $F(x)$ is not zero at infinity, it is the sum of a polynomial and a rational function that is zero there.
For the polynomial a polynomial solution always exists, while for the remaining rational function
the treatment outlined above will serve.
4. Solution in series of the polynomials $\{(x+1)^n-x^n\}$. In his proof that
if $f(x)$ is an entire function, then $y(x)$, also entire, exists to satisfy $\Delta y=f(x)$,
Hurwitz (loc. cit.) began with the Bernoulli polynomials, which satisfy the
relations $\Delta B_n(x)=x^n$ ($B_n$ of degree $n+1$). If, then, $f(x)=\sum f_nx^n$, one is led
to consider as a solution the series $\sum f_nB_n(x)$. Unfortunately, this series
converges only for a limited class of entire functions. Hurwitz was therefore led
to abandon a polynomial series, replacing it by a series of Bernoulli-Hurwitz
functions. We wish to show in this section that there is a polynomial series in
terms of which a solution for any analytic function can be expressed. The
polynomials in question are $\{(x+1)^n-x^n\}$, $n=1,2,\dots$.


THEOREM 4.1. If the series

(4.1)  $\displaystyle\sum_{n=1}^{\infty}c_n\{(x+1)^n-x^n\}$

converges* for $x=x_0$ where† $R(x_0)\neq-\tfrac12$, then it converges absolutely for every
$x$ in the open region‡ $S$ common to the two circles $|x|<\rho_0$, $|1+x|<\rho_0$, where
$\rho_0=\max(|x_0|,\ |1+x_0|)$; and converges uniformly in every closed region in $S$,
thus representing an analytic function in $S$.

The theorem follows readily on writing the series in the form

$$\sum\big[c_n\{(x_0+1)^n-x_0^n\}\big]\cdot\Big[\frac{(x+1)^n-x^n}{(x_0+1)^n-x_0^n}\Big].$$


THEOREM 4.2. A necessary and sufficient condition that series (4.1) converge
for at least one $x$ with $R(x)\neq-\tfrac12$ is that $\limsup|c_n|^{1/n}<2$.

(a) Suppose $\limsup|c_n|^{1/n}<2$. Then $\delta$ exists in $0<\delta<2$ such that
$|c_n|<(2-\delta)^n$ for all sufficiently large $n$ (say $n\ge N$). Let $\epsilon>0$ be chosen to
satisfy $2-\delta+\epsilon<2$, and choose $x_0=-(2-\delta+\epsilon)^{-1}$. Then $|x_0+1|<|x_0|$, so
that $|c_n\{(x_0+1)^n-x_0^n\}|<2|x_0|^n(2-\delta)^n$, $n\ge N$. Now $|x_0|(2-\delta)<1$; hence
(4.1) converges for $x=x_0$. (And $R(x_0)\neq-\tfrac12$.)


* It suffices to have $|c_n\{(x_0+1)^n-x_0^n\}|$ bounded.

† That the condition $R(x_0)\neq-\tfrac12$ is essential is seen as follows. Let $R(x_0)=-\tfrac12$; then
$x_0+1=|x_0|e^{i\omega_0}$, $x_0=|x_0|e^{i(\pi-\omega_0)}$, and

$$(x_0+1)^n-x_0^n=\begin{cases}|x_0|^n\,[2i\sin n\omega_0], & n \text{ even};\\ |x_0|^n\,[2\cos n\omega_0], & n \text{ odd}.\end{cases}$$

If, then, $\omega_0$ is commensurable with $\pi$, there will be infinitely many values of $n$ for which
$\{(x_0+1)^n-x_0^n\}=0$. For these values of $n$ we can choose $c_n$ as large as we please, with the result that
there will be divergence of (4.1) for every $x$ not on $R(x)=-\tfrac12$, while on $R(x)=-\tfrac12$ there can be
points of convergence.

‡ The region of convergence (except possibly for points on $R(x)=-\tfrac12$) is then bounded by two
circles of equal radii, with centers at $0$, $-1$.
(b) Suppose (4.1) converges for $x=x_0$, $R(x_0)\neq-\tfrac12$. By Theorem 4.1 we
may assume that $|x_0|>|x_0+1|$. Then $|x_0|>\tfrac12$, say $|x_0|=\tfrac12+\delta$, $\delta>0$. If
$\limsup|c_n|^{1/n}\ge 2$, then an $\epsilon>0$ exists for which both $(\tfrac12+\delta)(2-\epsilon)>1$ and
$|c_n|>(2-\epsilon)^n$ for infinitely many values of $n$, say $n=n_1,\ n_2,\ \dots$. Consequently,

$$|c_{n_i}\{(x_0+1)^{n_i}-x_0^{n_i}\}|>\tfrac12\,(2-\epsilon)^{n_i}|x_0|^{n_i}\qquad(i\ \text{sufficiently large}).$$

As the right hand member approaches infinity with $i$, series (4.1) fails to
converge at $x=x_0$. This contradiction establishes the theorem.
The same type of argument serves to prove 


COROLLARY 4.1. Let $\limsup|c_n|^{1/n}=\rho<2$. Then series (4.1)

(i) converges absolutely for every $x$ in the region common to $|x|<1/\rho$,
$|1+x|<1/\rho$;

(ii) converges uniformly in every closed region therein;

(iii) diverges for every point $x$ exterior to this region and not lying on the line
$R(x)=-\tfrac12$.

For convenience we shall say that (4.1) converges only if it converges for
at least one $x$ not on $R(x)=-\tfrac12$; and if it converges in the region common to
the circles $|x|<r$, $|1+x|<r$, we shall speak of such a region as a lens-region
of convergence of radius $r$.
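
The lens-region can be illustrated on the simplest example, the coefficients $c_n=\rho^n$, for which the series is a difference of two geometric series. The sketch below is only an illustration; the function names and the sample points are choices made here, not part of the original.

```python
def partial_sum(x, rho, terms):
    """Partial sum of sum_{n>=1} rho**n * ((x+1)**n - x**n)."""
    return sum(rho**n * ((x + 1)**n - x**n) for n in range(1, terms + 1))

def in_lens(x, radius):
    """True if x lies in the lens-region |x| < radius, |x + 1| < radius."""
    return abs(x) < radius and abs(x + 1) < radius

def closed_form(x, rho):
    """Difference of two geometric series; valid inside the lens-region of radius 1/rho."""
    def g(z):
        return rho * z / (1 - rho * z)
    return g(x + 1) - g(x)

rho = 1.5                     # lim sup |c_n|^(1/n) = 1.5 < 2
x_in = complex(-0.5, 0.2)     # |x| = |x + 1| is about 0.539 < 1/rho
x_out = complex(0.3, 0.0)     # |x + 1| = 1.3 > 1/rho

print(in_lens(x_in, 1 / rho), in_lens(x_out, 1 / rho))
print(abs(partial_sum(x_in, rho, 400) - closed_form(x_in, rho)))
print(abs(rho**50 * ((x_out + 1)**50 - x_out**50)))  # terms grow outside the lens
```

Inside the lens the partial sums settle on the closed form, while at the exterior point the individual terms are unbounded, in line with (i) and (iii).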


Expansions of type (4.1) can be related to solutions of equation (1). It is
evident that the function $e^{tx}$ ($t$ a parameter) has the expansion*

(4.2)  $e^{tx}=\displaystyle\sum_{n=1}^{\infty}\frac{t^n}{n!\,(e^t-1)}\{(x+1)^n-x^n\}\qquad(t\neq\pm 2k\pi i),$

convergent for all $x$, and uniformly convergent in $x$ and $t$ in any bounded
$x$-region and any bounded $t$-region excluding arbitrarily small neighborhoods of
the points $t=\pm 2k\pi i$.
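
The expansion rests on the identity $\sum_{n\ge 0}t^n\{(x+1)^n-x^n\}/n!=e^{t(x+1)}-e^{tx}$, and it can be checked numerically. The sketch below is illustrative only; the function name `expansion`, the truncation at 60 terms and the sample values of `t` and `x` are assumptions made here.

```python
import cmath
from math import factorial

def expansion(t, x, terms=60):
    """Truncated series sum_{n>=1} t^n / (n! (e^t - 1)) * ((x+1)^n - x^n)."""
    scale = cmath.exp(t) - 1  # must be nonzero, i.e. t is not a multiple of 2*pi*i
    return sum(t**n / (factorial(n) * scale) * ((x + 1)**n - x**n)
               for n in range(1, terms + 1))

t = complex(0.7, 1.1)
x = complex(-0.4, 2.0)
print(abs(expansion(t, x) - cmath.exp(t * x)))  # should be close to zero
```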

Series (4.2) can be integrated (in $t$) over any contour that avoids the
points $\pm 2k\pi i$. By use of (2.1) this tells us that if $f(x)$ is of exp. type $\rho$, then
it has the expansion

(4.3)  $f(x)=\displaystyle\sum_{n=1}^{\infty}\Big[\frac{1}{n!}\,\frac{1}{2\pi i}\int_C\frac{t^n}{e^t-1}\,F(1/t)\,\frac{dt}{t}\Big]\{(x+1)^n-x^n\},$

uniformly convergent in every bounded region. (Here $C$ is a contour about $t=0$,
not passing through any point $t=\pm 2k\pi i$, and with minimum distance from
$t=0$ exceeding $\rho$.)

* If (4.2) is multiplied by $(e^t-1)$, then no values of $t$ need be excluded. But now on setting
$t=\pm 2k\pi i$, we get a uniformly convergent expansion of the function zero. Hence an $\{(x+1)^n-x^n\}$-
expansion is by no means unique; there are infinitely many expansions of zero: $0=\sum c_n\{(x+1)^n-x^n\}$,
each one corresponding to a function $\sum c_nx^n$ that is periodic (with unity as a period) and analytic at
least in $|x|<\tfrac12$.

We now come to $F(x)$, assumed to be analytic in $|x|<r$, $r>\tfrac12$. Let $y(x)$
be the solution (2.8) of equation (2.9). As (2.8) converges uniformly in $|x|\le\tau$,
$y(x)$ has a power series expansion $y(x)=\sum\lambda_nx^n$, whose coefficients $\lambda_n$ can
be obtained by finding the corresponding coefficients in each function
$\int_{I_n}e^{-t}y_n(x;t)\,dt$ and adding. Making use of (2.5), we get
2kr n—1 
(4.4) m=—> f 
M! poi J — 

Since $\sum\lambda_nx^n$ converges in $|x|<\tau$, so will $y(x+1)=\sum\lambda_n(x+1)^n$ converge in
$|x+1|<\tau$, and as we can choose $\tau>\tfrac12$ (since $r>\tfrac12$), the two power series for
$y(x)$, $y(x+1)$ (whose difference is $F(x)$) have a common domain of
convergence. We accordingly have


THEOREM 4.3. Let $F(x)$ be analytic in $|x|<r$, $r>\tfrac12$, and let $\tau$ be any number
in $\tfrac12<\tau<r$. Then $F(x)$ has the expansion

(4.5)  $F(x)=\displaystyle\sum_{n=1}^{\infty}\lambda_n\{(x+1)^n-x^n\},$

convergent in a lens-region of radius at least $\tau$. The coefficients $\lambda_n$ are given by
(4.4), and $y(x)$ is a solution of $\Delta y(x)=F(x)$ in the lens-region.

It is perhaps clear that Theorem 4.3 cannot be the best theorem relating
to $\{(x+1)^n-x^n\}$-expansions, and for two reasons: $F(x)$ is here referred to
the origin ($x=0$) whereas Theorem 4.1 tells us that the central point in such
expansions is the point $x=-\tfrac12$; and an undue restriction is placed on the
radius of convergence of $F(x)$. The result of §3 will allow us to remove the
non-essential conditions.

Consider again the function $y(x;\alpha)$ of (3.3), $\alpha$ lying to the left of the
imaginary axis. The only singularities of $y$ are at $x=\alpha,\ \alpha-1,\ \dots$. Hence when
$|\alpha|>\tfrac12$, $y$ is analytic (about $x=0$) in a circle of radius greater than $\tfrac12$; and
we can write

(4.6)  $y(x;\alpha)=\displaystyle\sum_{n=0}^{\infty}c_n'(\alpha)\,x^n\qquad(|\alpha|>\tfrac12).$

If $\alpha$ lies in a region $R$ such that $|\alpha|\ge\rho>\tfrac12$ uniformly in $R$, then (cf. (3.3))
series (4.6) converges uniformly in $x$ and $\alpha$ for $\alpha$ in $R$ and $x$ in $|x|\le\rho'$ where
$\rho'$ is any number less than $\rho$. The functions $c_n'(\alpha)$ are analytic in $R$.

The radius of convergence of (4.6) is no longer greater than $\tfrac12$ if $|\alpha|\le\tfrac12$.
In this case let us proceed as follows: Define $\Omega$ by

(4.7)  $\Omega(x;\alpha)=\pi\cot\pi(x-\alpha).$

The only singularities of this function are simple poles at $x=\alpha,\ \alpha\pm1,\ 
\alpha\pm2,\ \dots$, and at $x=\alpha+n$ its principal part is $(x-\alpha-n)^{-1}$. Also, $\Delta\Omega(x;\alpha)=0$.
Hence the function

(4.8)  $Y(x;\alpha)=y(x;\alpha)+\Omega(x;\alpha)$

satisfies (3.2), and its only singularities are simple poles at the points
$x=\alpha+1,\ \alpha+2,\ \dots$. Hence if $\delta>0$ is sufficiently small, then for all $\alpha$
satisfying both $|\alpha|\le\tfrac12+\delta$, $|\alpha+1|\ge\tfrac12+\delta$, the nearest singularity of $Y$ to the origin
is $x=\alpha+1$; so that

(4.9)  $Y(x;\alpha)=\displaystyle\sum_{n=0}^{\infty}c_n''(\alpha)\,x^n\qquad(|\alpha|\le\tfrac12+\delta,\ |\alpha+1|\ge\tfrac12+\delta,\ \delta\ \text{sufficiently small}),$

the series converging uniformly in $x$ and $\alpha$ for $x$ in $|x|\le\delta'$ ($\delta'<\tfrac12+\delta$) and $\alpha$
in the domain already described. And in this domain, $c_n''(\alpha)$ is analytic.

Let $\mathcal{L}$ be a lens-region of radius $r=\tfrac12+\delta$, and let $A$, $B$ be the points in
which the boundary $C$ (of $\mathcal{L}$) meets the line $R(x)=-\tfrac12$, $A$ being the lower
point. When $C$ is traversed in the positive sense, two arcs are described:
$AB$, $BA$. Now $AB$ ($BA$) lies in the $\alpha$-domain for which (4.9) ((4.6)) holds, so
we have


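LEMMA 4.1. The function $(x-\alpha)^{-1}$ has the $\{(x+1)^n-x^n\}$-expansions

(4.10)  $(x-\alpha)^{-1}=\displaystyle\sum_{n=0}^{\infty}c_n''(\alpha)\{(x+1)^n-x^n\},\qquad (x-\alpha)^{-1}=\sum_{n=0}^{\infty}c_n'(\alpha)\{(x+1)^n-x^n\},$

uniformly convergent in $x$ and $\alpha$, for $x$ in every lens-region of radius less than
$\tfrac12+\delta$, and $\alpha$ on arcs $AB$, $BA$ respectively.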


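Now let $F(x)$ be a function analytic about $x=-\tfrac12$, so that there is an
open region $R$ of analyticity containing the point $-\tfrac12$. A lens-region $\mathcal{L}$ can be
chosen small enough so that it lies in $R$ and also so that for it Lemma 4.1 is
valid. Let the boundary of $\mathcal{L}$ be $C=C'+C''$, where $C'$, $C''$ are the left and
right half arcs. Then in $\mathcal{L}$,

$$F(x)=\frac{1}{2\pi i}\int_C\frac{F(\alpha)}{\alpha-x}\,d\alpha=-\frac{1}{2\pi i}\int_{C'}F(\alpha)\sum_{n=0}^{\infty}c_n'(\alpha)\{(x+1)^n-x^n\}\,d\alpha-\frac{1}{2\pi i}\int_{C''}F(\alpha)\sum_{n=0}^{\infty}c_n''(\alpha)\{(x+1)^n-x^n\}\,d\alpha,$$

the convergence being uniform in every closed region in $\mathcal{L}$. We thus have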

1936] A LOCAL SOLUTION OF A DIFFERENCE EQUATION 359 


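THEOREM 4.4. If $F(x)$ is analytic about $x=-\tfrac12$, it has an $\{(x+1)^n-x^n\}$-
expansion valid in some lens-region $\mathcal{L}$:

(4.11)  $F(x)=\displaystyle\sum_{n=0}^{\infty}f_n\{(x+1)^n-x^n\},\qquad f_n=-\frac{1}{2\pi i}\int_{C'}c_n'(\alpha)F(\alpha)\,d\alpha-\frac{1}{2\pi i}\int_{C''}c_n''(\alpha)F(\alpha)\,d\alpha;$

and the series $y(x)=\sum_0^{\infty}f_nx^n$ converges in a circle of radius exceeding $\tfrac12$, and in
this circle $y(x)$ is a solution of $\Delta y(x)=F(x)$.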
If we combine Theorems 4.1 and 4.4 we can state 


THEOREM 4.5. A necessary and sufficient condition that $F(x)$ have a
(convergent) $\{(x+1)^n-x^n\}$-expansion† is that $F(x)$ be analytic about $x=-\tfrac12$.

The special character of the point $-\tfrac12$ is not inherent in the difference
equation, as a translation shows. We thus get the following more general
statement:

THEOREM 4.6. If $F(x)$ is analytic about the point $x=a$, there is a function
$y(x)$, analytic about $x=a+\tfrac12$ in a circle of radius exceeding $\tfrac12$, which in this circle
satisfies the difference equation $\Delta y(x)=F(x)$.

Part II 


We shall now apply some of the methods of Part I to a more general 
equation. We shall give a local theory for the equation 


$$L[y(x)]\equiv a_1y(x+\omega_1)+a_2y(x+\omega_2)+\cdots+a_ky(x+\omega_k)=F(x),$$

$k>1$, where no $a_j$ is zero and the $\omega$'s are all distinct. The $a$'s and $\omega$'s are
complex constants (some or all of which may be real).

5. Some geometric lemmas. We define $\rho_j(x)$, $\rho(x)$ by

(5.1)  $\rho_j(x)=|x+\omega_j|\qquad(j=1,\dots,k),$
(5.2)  $\rho(x)=\max\{\rho_1(x),\dots,\rho_k(x)\}.$

The function $\rho(x)$ is continuous, and since $\rho(x)\to\infty$ as $|x|\to\infty$, it therefore
has a minimum value $\rho^*$:

(5.3)  $\rho^*=\min\rho(x).$

† It would be of interest to determine the largest possible lens-region of convergence for a given
function $F(x)$.
LEMMA 5.1. There is a unique point† $x=x^*$ where $\rho(x)$ takes on its minimum
value $\rho^*$.

An equivalent geometric statement is the following: Of all closed circles
$C$ covering the set $P_1,\dots,P_k$, where $P_j$ stands for the point $-\omega_j$, there is one
and only one of smallest radius. We know there is at least one. Suppose there
are two (say $C_1$, $C_2$) with radius $\rho^*$. Then $P_1,\dots,P_k$, being covered both
by $C_1$ and $C_2$, must lie in the closed zone common to $C_1$, $C_2$. But the circle $C$,
on the common chord of $C_1$, $C_2$ as diameter, covers this zone and therefore
covers $P_1,\dots,P_k$; and its radius is less than $\rho^*$. This contradiction
establishes the lemma.
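
For a concrete set of points the pair $(x^*,\rho^*)$ can be found by brute force. The sketch below is only an illustration and not the author's procedure: it uses the standard fact that the smallest covering circle is centered either at the midpoint of two of the points or at the circumcenter of three of them, evaluates $\rho(x)$ at every such candidate, and keeps the smallest value. The function names are hypothetical, and at least two points are assumed.

```python
from itertools import combinations

def circumcenter(p, q, r):
    """Circumcenter of three complex points, or None if they are (nearly) collinear."""
    ax, ay, bx, by, cx, cy = p.real, p.imag, q.real, q.imag, r.real, r.imag
    d = 2 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    if abs(d) < 1e-12:
        return None
    pa, pb, pc = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
    ux = (pa * (by - cy) + pb * (cy - ay) + pc * (ay - by)) / d
    uy = (pa * (cx - bx) + pb * (ax - cx) + pc * (bx - ax)) / d
    return complex(ux, uy)

def min_enclosing_circle(points):
    """Return (x_star, rho_star) for the points P_j = -omega_j (complex numbers)."""
    candidates = [(a + b) / 2 for a, b in combinations(points, 2)]
    for triple in combinations(points, 3):
        center = circumcenter(*triple)
        if center is not None:
            candidates.append(center)
    best = None
    for center in candidates:
        radius = max(abs(center - p) for p in points)  # rho(x) at the candidate
        if best is None or radius < best[1]:
            best = (center, radius)
    return best

# The difference operator of Part I: omega = (1, 0), so P = (-1, 0).
x_star, rho_star = min_enclosing_circle([complex(-1), complex(0)])
print(x_star, rho_star)
```

For the two points $-1$ and $0$ this gives the center $-\tfrac12$ and radius $\tfrac12$, which matches the special role of the point $-\tfrac12$ and the bound $2=1/\rho^*$ in §4.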

Consider any two of the $k$ points, say $P_j=-\omega_j$, $P_m=-\omega_m$. Let $L_{jm}$ be
the perpendicular bisector of segment $P_jP_m$. It defines two half-planes, and
in that one containing $-\omega_j$ we have $\rho_m(x)>\rho_j(x)$, while on $L_{jm}$ we have
$\rho_m(x)=\rho_j(x)$. Let $j$ be fixed, and draw the lines $L_{j1},\dots,L_{jk}$. If for each line
we choose that half-plane not containing $-\omega_j$, then the set of points
common to all such half-planes constitutes the complete set $S_j$ at every point of
which (and nowhere else) $\rho_j(x)$ is the only $\rho_m(x)$, $m=1,\dots,k$, having the
maximum value $\rho(x)$. On the boundary of $S_j$, $\rho_j(x)$ shares this maximum with
at least one other $\rho_m$. We denote the boundary by $B_j$. (One readily establishes
that $B_j$ is a convex polygon with its end sides running off to infinity.)

We turn now to two lemmas that will be needed in §7. 


LEMMA 5.2.‡ Let 1', 2', 3' be three points on a circle $C$, so situated that 2' and
3' are on opposite sides of the diameter through 1', and such that chords 1'2' and
1'3' make angles less than 45° with this diameter. Form a lattice work in the
plane with 1'2' and 1'3' as adjacent sides of a lattice parallelogram. Then of
all the lattice points in the plane, only 1', 2', 3' are in or on $C$.

LEMMA 5.3. In the figure of Lemma 5.2, let 1, 2, 3 be the points diametrically
opposite 1', 2', 3'. Let $Q$ be any point in or on $C$ but distinct from any one of the
six points 1, ..., 3', and translate the lattice work of Lemma 5.2 so that $Q$ is
a lattice point. If in this new lattice work there are lattice points other than $Q$ in
or on $C$, then the only possibilities are the following:


† It may be asked where the minimum point $x^*$ lies. The following facts are readily ascertained.
If there are exactly three points, $P_1$, $P_2$, $P_3$, then (i) $x^*$ is the center of the circumscribing circle, and
$\rho^*$ its radius, if $\triangle P_1P_2P_3$ is an acute-angled triangle; and (ii) $x^*$ is the mid-point of the longest side, and
$\rho^*$ is half that side, in the contrary case. Now consider the general case of $k$ points: $P_1,\dots,P_k$.
Let $K$ be the smallest convex polygon covering this set. Each triad of vertices of $K$ determines an
$x$ and a $\rho$, according to the rule just stated, and among these pairs $(x,\rho)$ is to be found the
minimizing $(x^*,\rho^*)$.

‡ The proof is quite elementary, and the lemma has been sent to the American Mathematical
Monthly, to be proposed as a problem.


(i) There are exactly two lattice points, $Q$ and $R$, where $QR$ is parallel and
equal to 1'2' or parallel and equal to 1'3'.

(ii) There are two or more lattice points, all lying on a line $l$ through $Q$,
parallel to the shorter diagonals of the lattice work; and on $l$ these points are consecutive
lattice points.


Through $O$ (the center of $C$) draw a line parallel to 1'3', and on this line
choose two points $x_3$, $x_4$ so that $x_3 3'=x_4 1'=\rho$. Draw circles $C_3$, $C_4$ of radius $\rho$,
centers $x_3$, $x_4$. $C_3$ and $C$ ($C_4$ and $C$) have in common a closed point set bounded
by two arcs that intersect at 3' and 1 (at 1' and 3); call this region $R_{13'}$ ($R_{1'3}$).
$Q$ being in or on $C$, in order that a point $R$ exist, in or on $C$, such that $QR$ is
equal and parallel to 1'3' (to 1'2') it is necessary and sufficient that $Q$ be in
$R_{13'}$ or $R_{1'3}$ (in $R_{12'}$ or $R_{1'2}$).†

Let $Q$ be in $R_{13'}$. We are to show that $R$ (in $R_{1'3}$) such that $QR$ is equal and
parallel to 1'3' is the only other lattice point in or on $C$. There are two circles
$C_1$, $C_2$ of radius $\rho$, having $QR$ as a chord. Join 3' to $Q$ by a straight line and
imagine that $Q$ is now a variable point starting at 3' and moving along to its
original position. This effects a continuous translation on $C_1$; it starts in a
position coincident with $C$, and ends in its original position. Now the farthest
limiting position of $Q$ from 3' is at 1, and when $Q$ is at 1, $C_1$ meets $C$ in 1 and 3.
Hence it follows by continuity that for any position of $Q$ in $R_{13'}$ (but not at 1
or 3'), circle $C_1$ meets $C$ once interior to the arc 13' and once interior to the
arc 1'3. A similar conclusion applies to $C_2$. Now $C_3$ ($C_4$) contains in and on
it the whole arc 13' (1'3). Hence all of $C$ (including boundary) is contained in
$C_1+C_2+C_3+C_4$ (including boundary).

Let $S$ be any point common to $C$, $C_3$ (including boundary). Then
$QS<$ chord 13', which is less than the minimum distance between any
two lattice points; hence $S$ cannot be a lattice point. Similarly there are no
lattice points in $R_{1'3}$ other than $R$. As for $C_1$: If through $R$ we draw a line
parallel to 1'2', it cuts $C_1$ in a point $S_1$, which is a lattice point. By Lemma 5.2,
$S_1$, $Q$, $R$ are the only lattice points in or on $C_1$. (Similarly, there is a point $S_2$
on $C_2$ such that $S_2$, $Q$, $R$ are the only lattice points in or on $C_2$.) But $S_1$ and
$S_2$ are exterior to $C$. For consider $S_1$, say. If it is in or on $C$, then since $RS_1$ is
equal and parallel to 1'2', therefore $R$ must be either in $R_{12'}$ or $R_{1'2}$; a
contradiction.

This establishes the first part of the lemma. The second part likewise
follows by a geometric continuity argument.


† $R_{12'}$, $R_{1'2}$ are the closed regions common to $C$ and $C_5$ and to $C$ and $C_6$ respectively, where
$C_5$, $C_6$ are circles of radius $\rho$ and centers $x_5$, $x_6$, and where $x_5$, $x_6$ lie on the line through $O$ parallel to
1'2' and such that $C_5$ ($C_6$) passes through 1 and 2' (1' and 2).
6. The $A_n(x)$-series and a solution of $L[y(x)]=1/(x-\alpha)$. We introduce
the polynomials†

(6.1)  $A_n(x)\equiv L[x^n]=a_1(x+\omega_1)^n+\cdots+a_k(x+\omega_k)^n\qquad(n=0,1,\dots).$

The determination of an asymptotic expression of $A_n(x)$ is possible if no two
of the quantities $\rho_j(x)=|x+\omega_j|$, $j=1,\dots,k$, give the maximum value $\rho(x)$.
Now we saw that the maximum $\rho(x)$ is shared when and only when $x$ is
on a boundary $B_j$. We shall therefore define the set of boundary points
$(B_1+\cdots+B_k)$ as the critical set.‡


THEOREM 6.1. If the series $\sum_0^{\infty}c_nA_n(x)$ converges at $x=x_0$ (not in the critical
set), it converges absolutely in the circular polygon $\Lambda_0$ defined by

(6.2)  $\Lambda_0:\ |x+\omega_j|<\rho(x_0)\qquad(j=1,\dots,k),$

and converges uniformly in every closed region therein, thus representing an
analytic function in $\Lambda_0$. The region $\Lambda_0$ contains the point $x=x^*$.

The theorem follows from the fact that a constant $m$ exists such that
$m|\rho(x_0)|^n<|A_n(x_0)|$, and that for $x$ in any closed region in $\Lambda_0$, there exist
numbers $C$, $\rho'$ such that $|A_n(x)|<C\rho'^n$, $\rho'<\rho(x_0)$.
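
The lower bound used here reflects the fact that, off the critical set, a single term $a_j(x_0+\omega_j)^n$ dominates $A_n(x_0)$. A small numerical illustration follows; it is a sketch only, and the coefficients, shifts and sample point are arbitrary values chosen here, not taken from the original.

```python
def rho(x, omega):
    """rho(x) = max_j |x + omega_j|, as in (5.2)."""
    return max(abs(x + w) for w in omega)

def ratio(n, x, a, omega):
    """|A_n(x)| / rho(x)^n, where A_n(x) = sum_j a_j (x + omega_j)^n."""
    a_n = sum(aj * (x + wj)**n for aj, wj in zip(a, omega))
    return abs(a_n) / rho(x, omega)**n

a = [2, -1, 0.5j]        # hypothetical coefficients a_j (none zero)
omega = [0, 1, 1j]       # hypothetical distinct shifts omega_j
x0 = complex(2.0, 0.5)   # |x0 + 1| is the unique largest of the |x0 + omega_j|

for n in (5, 20, 200):
    print(n, ratio(n, x0, a, omega))   # tends to |a_j| for the dominant index, here 1
```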

THEOREM 6.2. A necessary and sufficient condition that the series $\sum_0^{\infty}c_nA_n(x)$
converge for at least one point not in the critical set is that $\limsup|c_n|^{1/n}<1/\rho^*$.

(a) Suppose the series converges for $x=x_0$ not in the critical set. Then
there is just one value $j$ for which $\rho_j(x_0)=\rho(x_0)$. Call this common value $\rho'$.
If it is false that $\limsup|c_n|^{1/n}<1/\rho^*$, then there exists an $\epsilon>0$ such that
$|c_n|>(1/\rho^*-\epsilon)^n$ for infinitely many values of $n$, say $n_1,\ n_2,\ \dots$. Now $M$
exists so that $|A_n(x_0)|>M\rho'^n$, $n>N$ (suitably chosen). Hence

$$|c_{n_i}A_{n_i}(x_0)|>M\Big[\Big(\frac{1}{\rho^*}-\epsilon\Big)\rho'\Big]^{n_i}.$$

But $\rho'>\rho^*$, so that for $\epsilon$ sufficiently small the bracket is greater than unity.
The series therefore diverges for $x=x_0$; a contradiction.

† This set is a natural generalization of the polynomial set $\{(x+1)^n-x^n\}$. It is mentioned by
Ghermanesco (loc. cit., p. 249), but he scarcely makes use of the set. Pincherle (loc. cit., pp. 282-284)
points out the relation of Appell polynomials (of which $\{A_n(x)\}$ is an example) to certain linear
functional equations (including the one we are studying).

‡ The convergence problem on the critical set involves difficulties of a kind already encountered
at the beginning of §4, the critical set there being the line $R(x)=-\tfrac12$.

(b) Suppose $\limsup|c_n|^{1/n}=\sigma<1/\rho^*$. Then $\delta$ exists on $0<\delta<1/\rho^*$ such
that $|c_n|<(1/\rho^*-\delta)^n$ for all $n\ge N$. Choose $x_0=x^*+\eta$ where $|\eta|<\epsilon$, and (as
is possible) such that $x_0$ is not in the critical set. The point $x^*$ is in the critical
set; suppose it is on $B_j$. Then


1 n : n 
p 


On choosing $\epsilon$ sufficiently small, this last series converges; hence so does the
original at $x=x_0$.

COROLLARY 6.1. If $\limsup|c_n|^{1/n}=\sigma<1/\rho^*$, then $\sum_0^{\infty}c_nA_n(x)$ converges
absolutely for every $x$ in the (open) circular polygon of radius $1/\sigma$; converges
uniformly in every closed region therein; and diverges for every $x$ exterior to this
polygon (save possibly for points in the critical set).


We are concerned with the equation

(6.3)  $L[y(x)]\equiv a_1y(x+\omega_1)+\cdots+a_ky(x+\omega_k)=F(x).$

From (6.3) we get

(6.4)  $L[e^{tx}]=L(t)\,e^{tx},$

where

(6.5)  $L(t)=a_1e^{\omega_1t}+\cdots+a_ke^{\omega_kt}.$

It follows that for every $t$ for which $L(t)\neq 0$, equation $L[y(x)]=e^{tx}$ has
$y(x)=e^{tx}/L(t)$ as a solution; and the expansion $e^{tx}=\sum_0^{\infty}\{[t^nA_n(x)]\div[n!\,L(t)]\}$
is valid for all $x$.
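
This expansion follows from $\sum_{n\ge 0}t^nA_n(x)/n!=\sum_j a_je^{t(x+\omega_j)}=L(t)e^{tx}$, and it is easy to confirm numerically. The sketch below is illustrative only; the names `A`, `L_char` and `expansion`, the truncation at 80 terms, and the sample coefficients are assumptions made here.

```python
import cmath
from math import factorial

def A(n, x, a, omega):
    """A_n(x) = L[x^n] = sum_j a_j (x + omega_j)^n, as in (6.1)."""
    return sum(aj * (x + wj)**n for aj, wj in zip(a, omega))

def L_char(t, a, omega):
    """L(t) = sum_j a_j exp(omega_j t), as in (6.5)."""
    return sum(aj * cmath.exp(wj * t) for aj, wj in zip(a, omega))

def expansion(t, x, a, omega, terms=80):
    """Truncated series sum_n t^n A_n(x) / (n! L(t)); requires L(t) != 0."""
    series = sum(t**n * A(n, x, a, omega) / factorial(n) for n in range(terms))
    return series / L_char(t, a, omega)

a = [2, -1, 0.5j]        # hypothetical coefficients a_j
omega = [0, 1, 1j]       # hypothetical distinct shifts omega_j
t = complex(0.3, 0.4)
x = complex(1.0, -0.5)
print(abs(expansion(t, x, a, omega) - cmath.exp(t * x)))  # should be close to zero
```

With `a = [1, -1]` and `omega = [1, 0]` the same code reproduces the expansion (4.2) for the difference operator.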

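From this we get (cf. series (4.3) of §4)

THEOREM 6.3.† If $f(x)$ is of exp. type $\lambda$, then it has the expansion

$$f(x)=\sum_{n=0}^{\infty}\Big[\frac{1}{n!}\,\frac{1}{2\pi i}\int_C\frac{t^n}{L(t)}\,F(1/t)\,\frac{dt}{t}\Big]A_n(x),$$

uniformly convergent in every bounded region.‡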


Zeros of $L(t)$ give rise to expansions of zero, corresponding to exponential
solutions of the homogeneous $L$-equation. Thus, if $L(t_1)=0$, then $e^{t_1x}$ satisfies
$L[y(x)]=0$, and zero has the convergent expansion $0=\sum_0^{\infty}(t_1^n/n!)A_n(x)$.

To investigate the possibility of expanding an "arbitrary" analytic
function in an $A_n$-series, we are led to introduce the equation

† See Carmichael, these Transactions, loc. cit., pp. 1-7.
‡ $C$ is a contour about $t=0$, not passing through any zero of $L(t)$, and with minimum distance
from $t=0$ exceeding $\lambda$.
(6.7)  $L[y(x;\alpha)]=\dfrac{1}{x-\alpha};$

and the work of §3 suggests the solution†

(6.8)  $y(x;\alpha)=-\beta\displaystyle\int_0^{\infty}e^{-t}\,\frac{e^{\beta tx}}{L(\beta t)}\,dt.$

It is indeed a formal solution, and, if convergent, will be a true solution.
There will be a convergence difficulty at $t=0$ if $L(0)=0$. In this case,
suppose $t=0$ is a $p$-fold zero of $L(t)$. Then if we replace (6.8) by

(6.9)  $y(x;\alpha)=-\beta\displaystyle\int_0^{\infty}e^{-t}\,\frac{e^{\beta tx}-\sum_{r=0}^{p-1}(\beta tx)^r/r!}{L(\beta t)}\,dt,$

the integrand is convergent at the origin-end of the interval. This may
however restrict the range of $x$ for which convergence takes place at the other end.
To avoid this we write

(6.10)  $y(x;\alpha)=-\beta\displaystyle\int_0^{T}e^{-t}\,\frac{e^{\beta tx}-\sum_{r=0}^{p-1}(\beta tx)^r/r!}{L(\beta t)}\,dt-\beta\int_T^{\infty}e^{-t}\,\frac{e^{\beta tx}}{L(\beta t)}\,dt,$

where $T$ is any fixed positive number (which, however, we choose as zero if
$L(0)\neq 0$).
Define $\sigma_j(\alpha)$, $\sigma(\alpha)$ by

(6.11)  $\sigma_j(\alpha)=R(\beta\omega_j)=R(\omega_j/\alpha)\qquad(j=1,\dots,k),$
(6.12)  $\sigma(\alpha)=\max\{\sigma_1(\alpha),\dots,\sigma_k(\alpha)\}.$

If for a given $\alpha$, the maximum, $\sigma(\alpha)$, is attained for only one value of $j$, then
$L(\beta t)$ is of order $e^{t\sigma(\alpha)}$. But if two or more indices $j$ give the maximum, the
order of $L(\beta t)$ may not be clearly determined. If possible, then, we should
avoid such values of $\alpha$. Now if $\alpha$ is of this character, so also is $c\alpha$, where $c$ is
any positive number. We thus get a ray of values $\alpha$ (issuing from the origin)

† Pincherle (loc. cit., pp. 289-295) uses an analogous integral to discuss the equation
$L[y(x)]=F(x)$ where $F(x)$ is a function analytic at infinity. See also his pages 295-297 for an
interesting integral and series in the special case that $L(t)$ has the form
L(t) = (1 — — + (1 — 


of this type. The set of such rays (of which there are only a finite number) we
shall term the set of primary critical rays.

It is readily established† that if $P_1,\dots,P_k$ are the points of Lemma 5.1,
then the primary critical rays are to be found among those half-lines (i.e., rays)
drawn from the origin perpendicular to the segments $P_jP_r$, $j\neq r=1,\dots,k$. The
primary rays divide the plane into sectors, and throughout each sector there
is one and only one $j$ for which $\sigma_j(\alpha)=\sigma(\alpha)$. We shall assume that $\alpha$ is not on
a primary ray.‡ Then, as has been pointed out, $L(\beta t)$ is of order $e^{t\sigma(\alpha)}$, so that
we have

LEMMA 6.1. For $\alpha$ not on a primary or secondary critical ray, the integral
(6.10) converges for all $x$ in the half-plane $R(-1+\beta x-\sigma(\alpha))<0$. This
half-plane is bounded by the line perpendicular to the line joining $\alpha$ to the origin,
meeting this latter line in the point $[1+\sigma(\alpha)]\alpha$; and is that half-plane indicated
by an arrow from $\alpha$ pointing toward the origin.


Let $l$ denote the boundary line of the half-plane of convergence. If we
replace $x$ by $x+\omega_j$ in (6.10), convergence takes place in a half-plane whose
boundary line $l_j$ is obtained from $l$ by a translation§ of $-\omega_j$. In this way we
get $k$ half-planes bounded by parallel lines $l_1,\dots,l_k$. In that half-plane
common to all the $k$ half-planes, the operator $L$ can be applied to $y(x;\alpha)$, and the
result is $1/(x-\alpha)$. That is,

LEMMA 6.2. The function $y(x;\alpha)$ of (6.10) satisfies (6.7) in the half-plane
$R(-1+\beta x)<0$, which is determined by that one of $l_1,\dots,l_k$ that passes
through $\alpha$.

$y(x;\alpha)$ will remain a solution of (6.7) in every region into which it can
be continued. We now examine such continuation. The point $\alpha$ not being on a
critical ray, it must lie in a sector bounded by primary critical rays. In this
sector there is one and only one index, say $j=M$, for which $\sigma_j(\alpha)=\sigma(\alpha)$; and
of $l_1,\dots,l_k$, it is $l_M$ which passes through the point $\alpha$. The remaining lines
lie on that side of $l_M$ that is away from the origin. Denote by $H_j$ the half-plane
(containing the origin) bounded by $l_j$. The functions $y(x+\omega_j)$, $j=1,\dots,k$,
are analytic in $H_j$, and therefore all these functions are analytic in $H_M$.


† Cf. Pincherle, loc. cit., pp. 289-291.

‡ Also, we must avoid those values $\alpha$ for which $L(\beta t)$ vanishes somewhere on $0<t<\infty$. If
$h_1,\ h_2,\ \dots$ are the zeros of $L(t)$, then no one of the numbers $\alpha h_n$, $n=1,2,\dots$, is to be real and
positive. If we draw the ray issuing from the origin and passing through $h_n$, and then reflect in the real
axis, we get a ray on which $\alpha$ must not lie. Such a ray we shall term a secondary critical ray.

§ The equation of the half-plane is $R(-1+\beta x+\sigma_j(\alpha)-\sigma(\alpha))<0$. The smallest value that
$\sigma(\alpha)-\sigma_j(\alpha)$ has, $j=1,2,\dots,k$, is 0, so that the "smallest" half-plane is given by $R(-1+\beta x)<0$.
This half-plane is determined by the line through $\alpha$, and it contains the origin.
Let us solve (6.7) for $y(x+\omega_M)$:

(6.13)  $y(x+\omega_M)=\dfrac{1}{a_M}\Big[\dfrac{1}{x-\alpha}-{\displaystyle\sum_{j=1}^{k}}'\,a_j\,y(x+\omega_j)\Big],$

where the prime indicates that $j=M$ is excepted. Now of the lines $l_j$ ($j\neq M$),
one† of them, say $l_1$, is nearest to $l_M$. The right hand side of (6.13) is
therefore analytic in $H_1$, save for a simple pole at $x=\alpha$. That is, except for $x=\alpha$,
the region of analyticity of $y(x+\omega_M)$ has been extended from $H_M$ to $H_1$. This
automatically extends the range of $y(x+\omega_j)$, except for a pole, from $H_j$ to
$H_j^{(1)}$ (as we shall term it), where $H_j^{(1)}$ is the half-plane defined by the line
$l_j^{(1)}$, obtained from $l_j$ by the translation‡ $\omega_M-\omega_1$. At $x=\alpha+\omega_M-\omega_j$ (which is
on $l_j$), $y(x+\omega_j)$ has a simple pole of residue $1/a_M$. Elsewhere in $H_j^{(1)}$ it is
analytic. The half-planes $H_j^{(1)}$ have $H_M^{(1)}=H_1$ in common. Also, for $j\neq M$,
they have $H_1^{(1)}$ in common, and $y(x+\omega_j)$, $j\neq M$, is analytic in $H_1^{(1)}$ except
that for some§ values of $j$ there may be simple poles with residue $1/a_M$.
This permits us to use (6.13) again.

The result is to extend the region of analyticity of $y(x+\omega_M)$ (save for
certain poles) by another translation of amount $\omega_M-\omega_1$; and the same thing
applies to $y(x+\omega_j)$. Let the boundaries of the new half-planes‖ $H_j^{(2)}$ be $l_j^{(2)}$,
obtained by the translation $\omega_M-\omega_1$ from $l_j^{(1)}$. As for the poles of $y(x+\omega_M)$
in $H_M^{(2)}$: we have already noted the simple pole at $x=\alpha$. There is clearly a
simple pole at $x=\alpha+\omega_M-\omega_1$, with residue $-a_1/a_M^2$. The only other poles in
$H_M^{(2)}$ come from those values of $j$ for which the pole $\alpha+\omega_M-\omega_j$ of $y(x+\omega_j)$
lies in $H_M^{(2)}$; in which case $\alpha+\omega_M-\omega_j$ is a simple pole with residue $-a_j/a_M^2$.
For each pole of $y(x+\omega_M)$ in $H_M^{(2)}$ we get a corresponding (simple) pole of
$y(x+\omega_j)$ in $H_j^{(2)}$, with the same residue.

This process can be carried out indefinitely, each time extending the region
by the translation $\omega_M - \omega_1$. In general we have half-planes $H_j^{(n)}$, bounded
by $l_j^{(n)}$, where


Ay” Ly™ 


Each time that a pole of $y(x+\omega_j)$, $j \neq M$, enters for the first time a region
$H_M^{(n)}$, it provides a pole for $y(x+\omega_M)$ at the same point, whose residue is
promptly multiplied by $-a_j/a_M$. In the light of this discussion it becomes
clear that we have


† There may be several lines coincident with $l_1$; but no line $l_j$ ($j \neq M$) coincides with $l_M$.

‡ The point $x = a$ we know to be on $l_M$. It is easily computed that $x = a + \omega_M - \omega_j$ is on $l_j$. The
lines $l_1, \cdots, l_k$ can then be characterized as follows: They are the lines through $a + \omega_M - \omega_j$,
$j = 1, \cdots, k$, perpendicular to the line joining $a$ to the origin. When we move from $l_M$ to $l_M^{(1)} = l_1$,
which is a translation by the vector $\omega_M - \omega_1$, each of the lines $l_j$ undergoes a translation of the same
amount.

§ These values of $j$ will correspond to those points $a + \omega_M - \omega_j$ ($j \neq M$) that lie in $H_1^{(1)}$.

| Hy ly


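THEOREM 6.4. For each $a$ in a sector† bounded by primary critical rays (and
$a$ not on a secondary ray), the function $y(x; a)$ is a meromorphic function. Its
only poles are simple, and they are at the points‡

\[
x = a + \omega_M;\quad x = a + 2\omega_M - \omega_j \ (j \neq M);\quad \cdots;\quad
x = a + (n+1)\omega_M - (\omega_{j_1} + \cdots + \omega_{j_n}) \ (j_1 \neq M,\ j_2 \neq M,\ \cdots,\ j_n \neq M);\quad \cdots
\]

and the corresponding residues are

\[
\frac{1}{a_M};\quad -\frac{a_j}{a_M^{2}};\quad \cdots;\quad (-1)^n\,\frac{a_{j_1} a_{j_2} \cdots a_{j_n}}{a_M^{\,n+1}};\quad \cdots
\]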
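
The pole set described in Theorem 6.4 can be generated level by level: each step adds $\omega_M - \omega_j$ ($j \neq M$) to every point of the previous level. The following is only an illustrative sketch; the function name `pole_locations`, the cutoff `depth`, and the sample values are assumptions introduced here, and exact (integer or rational) inputs are assumed so that coinciding points are merged correctly.

```python
def pole_locations(a, omegas, M, depth):
    """Return the points a + (n+1)*omega_M - (omega_{j1} + ... + omega_{jn})
    for 0 <= n <= depth, with every j_i different from M.

    `omegas` is a list of exact numbers (ints, Fractions, or complex numbers
    with integer parts); `M` is the index singled out by the sector.
    """
    w_M = omegas[M]
    others = [w for j, w in enumerate(omegas) if j != M]
    level = {a + w_M}          # n = 0: the pole nearest the origin
    found = set(level)
    for _ in range(depth):
        # one more translation by omega_M - omega_j for each admissible j
        level = {z + w_M - w for z in level for w in others}
        found |= level
    return found


if __name__ == "__main__":
    # Hypothetical data: omega = (0, 1, 2), M = 0, a = 0.
    print(sorted(pole_locations(0, [0, 1, 2], 0, 2)))
```

The sketch lists locations only; it does not track residues, which may combine when several index sequences lead to the same point.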


We can now treat equation (6.3) in the case that $F(x)$ is rational.

Lemma 6.3. If $F(x)$ is a polynomial, then (6.3) has a polynomial solution.


To show this, we observe that if $L(t)$ has a $p$-fold zero at $t = 0$ ($p = 0$ is possible)
then $A_n(x)$ is of degree exactly $n - p$, $n \geq p$. ($A_n(x) \equiv 0$, $n < p$.) Let
$F(x)$ be of degree $r$. Then $F(x)$ has an expansion $F(x) = \sum_{i=0}^{r} f_i A_{p+i}(x)$,
and $y(x) = \sum_{i=0}^{r} f_i x^{p+i}$ is a solution.§
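
The argument for Lemma 6.3 is constructive, and a small exact computation may make it concrete. The sketch below assumes (this is an assumption, not stated in the present section) that the operator has the finite form $L[y](x) = \sum_j a_j\, y(x+\omega_j)$, so that $A_n(x) = L[x^n]$; the names `apply_L` and `polynomial_solution` are hypothetical. Polynomials are stored as coefficient lists in ascending powers.

```python
from fractions import Fraction
from math import comb


def apply_L(y, a, w):
    """Apply L[y](x) = sum_j a[j] * y(x + w[j]) to a polynomial y
    given by ascending coefficients; returns ascending coefficients."""
    out = [Fraction(0)] * len(y)
    for aj, wj in zip(a, w):
        for n, c in enumerate(y):
            for m in range(n + 1):
                # binomial expansion of (x + wj)^n
                out[m] += Fraction(aj) * c * comb(n, m) * Fraction(wj) ** (n - m)
    return out


def polynomial_solution(F, a, w):
    """Solve L[y] = F for a polynomial F by expanding F in A_p, ..., A_{p+r}.
    Assumes the moments sum_j a_j w_j^m are not all zero."""
    F = [Fraction(c) for c in F]
    r = len(F) - 1
    # p = order of the zero of L(t) = sum_j a_j exp(w_j t) at t = 0
    p = 0
    while sum(Fraction(aj) * Fraction(wj) ** p for aj, wj in zip(a, w)) == 0:
        p += 1
    residual = F[:]
    y = [Fraction(0)] * (p + r + 1)
    for i in range(r, -1, -1):
        monomial = [Fraction(0)] * (p + i) + [Fraction(1)]
        A = apply_L(monomial, a, w)      # A_{p+i}(x), of degree exactly i
        f_i = residual[i] / A[i]         # match the coefficient of x^i
        y[p + i] = f_i
        for m in range(i + 1):
            residual[m] -= f_i * A[m]
    return y


if __name__ == "__main__":
    # Forward difference y(x+1) - y(x) = x^2, where p = 1.
    print(polynomial_solution([0, 0, 1], [1, -1], [1, 0]))
```

For the forward difference the routine returns the coefficients of $x/6 - x^2/2 + x^3/3$; in line with footnote §, the constant term is left at zero because $p = 1$.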

Now let $F(x)$ be a rational function. By virtue of Lemma 6.3 we can suppose
that $F(x)$ is analytic and zero at $x = \infty$, so that it has the form


i=0 


(6.14) 


jul rm (% — 


Now 


(6.15) 


(r— 1)! 


~ 


(x — a)" 


which gives us 


† The same index $M$ is retained for all values $a$ in one and the same sector. We may call it a
$\Sigma_M$-sector.

‡ The singularity of $y(x; a)$ nearest the origin is at $x = a + \omega_M$.

§ If $p > 0$, the most general polynomial solution is obtained by adding to $y(x)$ the polynomial
$c_0 + c_1 x + \cdots + c_{p-1} x^{p-1}$, where the $c$'s are arbitrary. If $p = 0$, there is a unique polynomial solution.


THEOREM 6.5. If $F(x)$ is a rational function, the equation (6.3) has a meromorphic
solution. If all its poles lie in the same sector bounded by primary rays,
and none of its poles is on a secondary ray, then on writing $F(x) = P(x) + R(x)$,
where $P(x)$ is a polynomial and $R(\infty) = 0$ (so that $R(x)$ is given by (6.14)), this
solution can be written


(6.16) = yo(x) + OD 


j=1 r=1 — 1)! 


Here $y_0(x)$ is any polynomial solution of $L[y_0] = P(x)$ assured us by Lemma 6.3.
If the poles are not so located, then a constant $\gamma$ can be found so that the solution
can be written


(= iy, + ity) 
(6.17) = yo(x) + 


j=l r=1 (r 1)! dx"! 


The theorem is evident if the poles $a_j$ are in one sector and are off secondary
rays. In the contrary case a number $\gamma$ can be found such that the
points $a_j + \gamma$ satisfy the previous condition. Set $G(x) = R(x - \gamma)$. Its poles are
at $a_j + \gamma$, so that the sum in (6.16) represents a solution of $L[y] = G(x)$ if we
replace $a_j$ by $a_j + \gamma$. There now remains only to replace $x$ by $x + \gamma$ in the sum,
and this gives (6.17).

To treat the general case, in which $F(x)$ is merely analytic, it is convenient
to make a slight transformation on equation (6.3). The unique point $x^*$ of
Lemma 5.1, which is contained in the region of convergence of all $A_n(x)$-expansions,
may conceivably be on a critical ray (whether of first or second
type). This is undesirable. Let us make the translation $x = x' + \delta$. Then $L[y(x)]$
becomes $L_1[y(x')]$, where $L_1$ is determined by the function $L_1(t) = e^{\delta t} \cdot L(t)$.
The zeros of $L_1(t)$ and $L(t)$ are therefore the same, so that in going from operator
$L$ to $L_1$, the secondary critical rays are left unchanged. Again, the points
$-\omega_1, \cdots, -\omega_k$ are replaced by $-\omega_1 - \delta, \cdots, -\omega_k - \delta$, which is a translation
of the points $P_1, \cdots, P_k$; hence primary rays are also unchanged. Thus, in
going from $L$ to $L_1$, all critical rays are invariant. The translation on the points
$P_j$ means that the new unique point $x_1^*$ is related to the old by $x_1^* = x^* - \delta$;
while the minimum value $p^*$ is left unchanged: $p_1^* = p^*$.

It is therefore possible to choose $\delta$ so that $x_1^*$ is not on any critical ray.
Now in transforming from $L$ to $L_1$, the form of the equation is the same.
Hence: we may assume from now on that in equation (6.3), $x^*$ is not on any
critical ray. This implies no loss of generality of equation (6.3).

We set down the following theorem (of which we have need), due to
Carmichael:†


THEOREM 6.6. If F(x) is an entire function, then equation (6.3) has an entire 
function solution. 


7. The general case. To treat the general case where $F(x)$ is merely analytic,
we follow a path suggested by §4. For this we need a function that plays
a role analogous to that played by the function $\pi \cot \pi(x - a)$ of (4.7). We
shall be able to show that such a function exists.

Let C be the unique circle, center at x* and radius p*, assured us by 
Lemma 5.1. As a consequence of uniqueness there are seen to be precisely 
two possibilities: Either 

Case I. There are at least two points $P_j$ on $C$, and of these at least one
pair, say $P_1$, $P_2$, are diametrically opposite.

Or,

Case II. There are at least three points $P_j$ on $C$; of these no two are diametrically
opposite, but at least one triad of them (say $P_1$, $P_2$, $P_3$) forms an
acute-angled triangle.

Case I. Let $1, 2, \cdots, k$ stand for the points $P_1, \cdots, P_k$, and let
$1', 2', \cdots, k'$ be the points respectively symmetric to $1, 2, \cdots, k$ in the
center $x^*$ of $C$. Consider the $k-1$ vectors $v_2, \cdots, v_k$ issuing from point $1'$,
along the lines $1'2', \cdots, 1'k'$, but in the opposite sense, and of lengths
$1'2', \cdots, 1'k'$. To the ends of these vectors we ascribe “coordinates” of
$(1, 0, \cdots, 0), \cdots, (0, 0, \cdots, 1)$ respectively. The point $1'$ will have coordinates
$(0, 0, \cdots, 0)$. Now form a “lattice work” in the plane, with these
$k-1$ vectors. We thus get a countable infinity of “points” $(n_2, n_3, \cdots, n_k)$,
as $n_2, \cdots, n_k$ run independently through all integral values (from $-\infty$ to
$+\infty$). For example, the points $1', 2', \cdots, k'$ have coordinates $(0, 0, \cdots, 0)$,
$(-1, 0, \cdots, 0)$, $\cdots$, $(0, 0, \cdots, 0, -1)$. We shall refer to the diagram so
obtained as Diagram 1 (for short, D.1).

Associated with it will be Diagram 2 (D.2), obtained by constructing
an ordinary rectangular axis-system in the $(k-1)$-space of the coordinates
$(n_2, \cdots, n_k)$. By the “point” $(n_2, \cdots, n_k)$ will be meant either the point in
D.1 reached by starting at $1'$ and laying off the vector $n_2 v_2 + \cdots + n_k v_k$, or
the point $(n_2, \cdots, n_k)$ in the rectangular space of D.2.

In D.2 let us place a dot at every point‡ $(n_2, \cdots, n_k)$ for which the corresponding
point in D.1 is in or on $C$, with the exception of $(0, \cdots, 0)$ and
$(-1, 0, 0, \cdots, 0)$. We shall say that a point so dotted is primary-dotted
(p-dotted, for short). As against this, we shall dot certain other points, which will
be said to be s-dotted (secondary-dotted). These points are determined as
follows. Let $(n_2, \cdots, n_k)$ be any point in D.2, and consider with it the $k-1$
points $(n_2, n_3, \cdots, n_{j-1}, n_j + 1, n_{j+1}, \cdots, n_k)$, $j = 2, 3, \cdots, k$. These $k$ points
form a figure that we shall call an L-figure, and the points themselves will be
vertices. Observe that every point $(n_2, \cdots, n_k)$ is a vertex of $k$ L-figures.


† These Transactions, loc. cit., pp. 11-13.
‡ The numbers $n_2, \cdots, n_k$ are all integers, it is to be recalled.

Consider any L-figure. If no fewer than k—1 vertices are p-dotted, then we 
shall s-dot the kth vertex (if it is not already dotted). Likewise, if in any L-figure 
no fewer than k—1 vertices are dotted (each vertex may be either p-dotted or s- 
dotted), then the remaining vertex (if not already dotted) is to be s-dotted. 

This being understood, we now raise the question: Is it true that neither
$(0, 0, \cdots, 0)$ nor $(-1, 0, \cdots, 0)$ is s-dotted? We shall show that the question
is to be answered affirmatively. The important fact in the proof is


Property A. If $(n_2, \cdots, n_k)$ is in or on $C$, and is not at $1'$ or $2'$, then
$(n_2', n_3, \cdots, n_k)$ is exterior to $C$ for all† $n_2' \neq n_2$.

If $k = 2$, then no point $(n_2)$ is p-dotted, so that neither $(0, 0)$ nor $(-1, 0)$
is s-dotted. We may then assume that $k > 2$. It is conceivable that there is a
point $(n_2, \cdots, n_k)$ other than $(0, \cdots, 0)$ or $(-1, 0, \cdots, 0)$, which, in D.1,
coincides with $1'$ or $2'$. This possibility gives rise to a situation that had best
be treated after we become familiar with the slightly simpler, contrary, case.
We therefore suppose, at present, that the only point $(n_2, \cdots, n_k)$ coinciding
with $1'$ or $2'$ in D.1, is, respectively, $(0, \cdots, 0)$, $(-1, 0, \cdots, 0)$.

Consider the “plane” (as we shall call it) $n_2 = $ constant, say $n_2 = n_2'$. In this
“plane” there may be a set of p-dots. These p-dots may generate some s-dots.
Let us call an elementary figure the set of p-dots in $n_2 = n_2'$ plus all those s-dots
derived from these p-dots (using no other p-dots). It is clear that all s-dots
lying in an elementary figure lie only on $l_2$-lines (i.e., lines along which only
the coordinate $n_2$ varies), which already contain a p-dot in the “plane”
$n_2 = n_2'$. If then we draw all the $l_2$-lines through the p-dots that lie in the
“plane” $n_2 = n_2'$, these lines contain all the s-dots of the elementary figure
determined by $n_2 = n_2'$. In particular, an $l_2$-line that contains no p-dot can
contain no point of any elementary figure.

Now consider a second “plane” $n_2 = n_2''$. Corresponding to its p-dots we
get a second elementary figure, whose dots (whether p- or s-) lie on $l_2$-lines
that are all distinct from the $l_2$-lines of the previous elementary figure (Property A).
It is possible that these two elementary figures (or even more such
figures), when combined, give rise to s-dots not obtainable from any one elementary
figure alone. Let us examine such a possibility. Let $a_1 = (n_2, \cdots, n_k)$,
$a_2 = (n_2 + 1, n_3, \cdots, n_k)$, $\cdots$, $a_k = (n_2, \cdots, n_{k-1}, n_k + 1)$ be the vertices of an
L-figure of which $k-1$ vertices are dotted, thus forcing the $k$th to receive
an s-dot. There are two possibilities:


† D.1 readily shows the truth of this statement.

(i) The $k$th vertex is neither $a_1$ nor $a_2$. Then $a_1$ and $a_2$ are already dotted,
and as they lie on an $l_2$-line, they belong to the same elementary figure. Hence
$a_2$ (which is not p-dotted) could only have been s-dotted by the fact that the
vertices $a_1, a_3, \cdots, a_k$ are all in this same elementary figure. That is, the $k$th
vertex that we are supposed to s-dot has already been dotted (either p- or s-)
in an elementary figure. We therefore get nothing new.

(ii) The $k$th vertex is $a_1$ (or $a_2$). Then $a_2$ (or $a_1$) is already dotted, and inasmuch
as $a_2$ (or $a_1$) already lies on an $l_2$-line containing a p-dot, the same is
therefore true of $a_1$ (or $a_2$). Hence here again we never dot a point that is
off the $l_2$-lines through p-dot points. Further, such new s-dots may give rise
to additional s-dots, but never off an $l_2$-line through a p-dot.

It follows that if there exists an $l_2$-line containing no p-dots, then such a
line can never receive an s-dot. But the line $n_3 = \cdots = n_k = 0$ is such a
line. Hence $(0, \cdots, 0)$, $(-1, 0, \cdots, 0)$ will never be s-dotted, as was to be
proved. We have supposed that no $(n_2, \cdots, n_k)$ other than $(0, \cdots, 0)$ or
$(-1, 0, \cdots, 0)$ lies at point $1'$ or $2'$ in D.1. Now suppose there is such a
point $(n_2, \cdots, n_k)$, at $1'$, say. It will then be true that for infinitely many
sets of values $(n_2, \cdots, n_k)$ the corresponding point is at $1'$. Consequently,
in D.2, there will be infinitely many pairs of p-dots such that each pair lie
on an $l_2$-line (and are consecutive points on such a line). Consider such a
pair, say $a = (n_2 - 1, n_3, \cdots, n_k)$, $b = (n_2, n_3, \cdots, n_k)$. From D.1 we see that
the points $(n_2, n_3, \cdots, n_k)$ and $(n_2 - 1, n_3, \cdots, n_{j-1}, n_j + 1, n_{j+1}, \cdots, n_k)$,
$j = 3, 4, \cdots, k$, receive p-dots as also do the points $(n_2, n_3, \cdots, n_j - 1, \cdots, n_k)$,
$j = 2, 3, \cdots, k$.

Corresponding to the “plane” $n_2 = $ constant passing through $a$ there will
be an elementary figure, which is independent of the fact that $b$ is p-dotted.
(Point $b$ would receive an s-dot if it were not p-dotted.) The elementary
figure for the “plane” $n_2 = $ constant passing through $b$ may contain s-dots
that would not have existed if $b$ were not p-dotted. But all such points in
this elementary figure would become s-dots, anyway, when the two elementary
figures were considered together, even if $b$ were not p-dotted. Hence all
s-dots in an elementary figure still lie on $l_2$-lines through p-dots. On combining
two or more elementary figures, the new s-dots obtained have the
same property. Hence no $l_2$-line devoid of p-dots can contain s-dots, and
again we conclude that neither $(0, \cdots, 0)$ nor $(-1, 0, \cdots, 0)$ is s-dotted.

Case II. Here we have an acute-angled triangle 123 (i.e., $P_1 P_2 P_3$) inscribed
in $C$. Of the angles $1x^*2$, $2x^*3$, $3x^*1$, at least two lie between 90° and
180°. We suppose the indices 1, 2, 3 so chosen that two such angles are $1x^*2$,
$1x^*3$.

Let us again take points $1', \cdots, k'$ symmetric to $1, \cdots, k$ in the center
$x^*$, and construct a “lattice work” with the vectors $v_2, \cdots, v_k$ as was done
in Case I. (Point $1'$, for example, has again the coordinates $(0, \cdots, 0)$.) This
gives us Diagram 1. Diagram 2 is constructed as before. We now make the
same rules regarding p-dots and s-dots, with the single change that of all points
$(n_2, \cdots, n_k)$ in or on $C$, only $(0, \cdots, 0)$ is not to be p-dotted.

We propose to show that $(0, \cdots, 0)$ is not s-dotted.

Suppose $k = 3$. Then by Lemma 5.2 (which applies here) we know there
are no points $(n_2, n_3)$ in or on $C$ other than $(0, 0)$, $(-1, 0)$, $(0, -1)$. There is
therefore only one s-dot, namely at $(-1, -1)$. Hence $(0, 0)$ is not s-dotted.
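
The L-figure rule is a closure process and can be checked mechanically in this small case. The sketch below is illustrative only: it assumes that an L-figure consists of a corner together with the points obtained by raising one coordinate by 1, it works in a bounded window of the lattice rather than the whole of D.2, and the name `close_dots` and the window bounds are introduced here for the example.

```python
from itertools import product


def close_dots(p_dots, dim, lo, hi):
    """Apply the L-figure rule until nothing changes.

    An L-figure is taken to be a corner c together with the `dim` points
    obtained from c by adding 1 to a single coordinate. Whenever all but
    one vertex of a figure are dotted, the remaining vertex is s-dotted.
    Only figures whose corner lies in [lo, hi)^dim are examined.
    Returns the set of s-dots produced.
    """
    dotted = set(p_dots)
    s_dots = set()
    changed = True
    while changed:
        changed = False
        for corner in product(range(lo, hi), repeat=dim):
            figure = [corner] + [
                tuple(c + (1 if i == j else 0) for i, c in enumerate(corner))
                for j in range(dim)
            ]
            missing = [v for v in figure if v not in dotted]
            if len(missing) == 1:
                dotted.add(missing[0])
                s_dots.add(missing[0])
                changed = True
    return s_dots


if __name__ == "__main__":
    # Case II with k = 3: p-dots at (-1, 0) and (0, -1); (0, 0) is left undotted.
    print(close_dots({(-1, 0), (0, -1)}, dim=2, lo=-3, hi=3))
```

In this window the only s-dot produced is $(-1, -1)$, and $(0, 0)$ stays undotted, in agreement with the statement above.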

Now suppose $k > 3$. We shall use the term $\pi$-plane to mean a plane in which
only the coordinates $n_2$, $n_3$ vary (so that $n_4, \cdots, n_k$ are constant). Every
L-figure has three of its vertices in a $\pi$-plane. We shall call the 2-dimensional
L-figure, consisting of these three vertices, the base of the (original) L-figure.
By $l_2$ ($l_3$) we shall mean a line along which only the coordinate $n_2$ (coordinate
$n_3$) varies.

Consider any $\pi$-plane. It may contain some p-dots. Let us examine where
they may be. We shall assume, for the present, that no point $(n_2, \cdots, n_k)$
is at 1 or $1'$, except, of course, $(0, \cdots, 0)$ which is at $1'$. Then an application
of Lemma 5.3 gives us


Property B. In each $\pi$-plane one (and only one) of the following three
possibilities is realized:

(a) There is no p-dot or there is just one.

(b) There are just two p-dots, and these are consecutive points on either an
$l_2$ line or an $l_3$ line. (These two dots are then at two of the vertices of the base of
an L-figure, one of the dots being a corner vertex.)

(c) There are two or more p-dots, all of them consecutive on a line
$n_2 + n_3 = $ constant. Their coordinates (writing only the $n_2$ and $n_3$ coordinates) can
be written: $(n_2', n_3')$, $(n_2' - 1, n_3' + 1)$, $\cdots$, $(n_2' - r, n_3' + r)$.


Let us now consider the possibility of s-dots. Suppose a given L-figure
has $k-1$ p-dots, thus forcing an s-dot. From Property B, this s-dot must lie
in the base of the L-figure, since no base is completely p-dotted. Suppose all
the s-dots of this character have been put in, thus dotting the complete base
of the corresponding L-figures.† We have thus enlarged the number of dots.
Call this system of p-dots and s-dots, $K$. Then it is readily seen that there
exist no s-dots not in $K$.


† In case (c) of Property B, the s-dots put in may themselves fill the non-corner points of a
base, in which case it may be necessary to s-dot the new corner points. We suppose this has been
done as often as is necessary.

For let a be the first s-dot put in that does not belong to K. Then a lies 
in an L-figure the remaining k—1 vertices of which are in K. Now a cannot 
lie in the base of this L-figure, since K contains all dotted points lying in a 
base of an L-figure. It must then be that in this L-figure (containing a) the 
base lies in K. But at least one vertex of the base is s-dotted, and this s-dot 
was forced by having all the other vertices already dotted (and hence in K). 
Therefore the assumed point a does not exist. 

We conclude that a point $P(n_2, \cdots, n_k)$ cannot be s-dotted if, when we
consider only the 2-dimensional L-figures lying in the $\pi$-plane through $P$,
then $P$ is not s-dotted. Now the $\pi$-plane for which $n_4 = \cdots = n_k = 0$ contains
the point $(0, \cdots, 0)$, and in it only the points $(0, -1, 0, \cdots, 0)$,
$(-1, 0, \cdots, 0)$ are p-dotted. Hence in this $\pi$-plane the only point that could
possibly be s-dotted is $(-1, -1, 0, \cdots, 0)$. In particular, then, $(0, \cdots, 0)$
is never s-dotted.

We may now remove the restriction that no point $(n_2, \cdots, n_k)$ can lie
at 1 or $1'$ other than $(0, \cdots, 0)$ at $1'$. This now permits the existence of
infinitely many $\pi$-planes that contain three p-dots.† This increase in p-dots
may permit more s-dots than was the case before, but as was true in Case I,
the slight increase in complexity will not alter the fact that $(0, \cdots, 0)$ remains
free of an s-dot.

Summing up, we have 


Lemma 7.1. If p-dots are placed in Diagram 2 at all points $(n_2, \cdots, n_k)$
that are in or on $C$, save $(0, \cdots, 0)$, $(-1, 0, \cdots, 0)$ in Case I and save
$(0, \cdots, 0)$ in Case II, and if s-dots are placed in accordance with the L-figure
rule, then $(0, \cdots, 0)$, $(-1, 0, \cdots, 0)$ remain free of a dot in Case I and
$(0, \cdots, 0)$ remains free of a dot in Case II.


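Consider the homogeneous equation

\[
L[Y(x; a)] = 0. \tag{7.1}
\]

Assume a formal solution

\[
Y(x; a) \sim \sum_{n_2, \cdots, n_k = -\infty}^{+\infty}
\frac{b_{n_2 \cdots n_k}}{x - (a + \omega_1) - n_2(\omega_1 - \omega_2) - n_3(\omega_1 - \omega_3) - \cdots - n_k(\omega_1 - \omega_k)}, \tag{7.2}
\]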


† If $(n_2, \cdots, n_k)$ is at 1, the three p-dots make up the base of an L-figure; but if $(n_2, \cdots, n_k)$
is at $1'$, two of the p-dots are non-corner points of a base, while the third is the vertex opposite to the
corner point in the unit square that contains the base.

where $-\omega_1$ is the point $P_1$ of Case I or Case II. On substituting into (7.1)
we obtain


E nj(ar — | 


2 


j=2 


and this is formally satisfied if the b’s are chosen so that 


k 


j=2 


(m2, =0, +1, +2,--- 


Choose $a = x^*$ and let

\[
\tau_{n_2 \cdots n_k} = x^* + \omega_1 + \sum_{j=2}^{k} n_j(\omega_1 - \omega_j). \tag{7.4}
\]

The poles of $Y(x; x^*)$ (in a purely formal way) are among the points $\tau$. Now
$x^* + \omega_1$ is at a distance $p^*$ from the origin, and the line joining it to the origin
is parallel to $11'$ in D.1. Hence those $\tau$'s which are at a distance not exceeding
$p^*$ from the origin correspond to those vectors

\[
v_{n_2 \cdots n_k} = \sum_{j=2}^{k} n_j(\omega_1 - \omega_j) \tag{7.5}
\]

which, when laid off with initial point at $1'$, have their terminal point in or
on $C$. Now when so laid off, the $v$'s give us precisely the points $(n_2, \cdots, n_k)$
of the lattice work of Lemma 7.1. If we wish not to have poles $\tau$ at a distance
not exceeding $p^*$ from the origin, then we must choose $b_{n_2 \cdots n_k} = 0$ for every
$(n_2, \cdots, n_k)$ in or on $C$ (in D.1). We shall, however, except the values
$(0, \cdots, 0)$, $(-1, 0, \cdots, 0)$ in Case I, and the value $(0, \cdots, 0)$ in Case II
(i.e., the corresponding $b$'s are (or $b$ is) not to be chosen zero). Otherwise
put, all $b$'s corresponding to p-dots are to be chosen zero. Now on so doing
(7.3) informs us that whenever $k-1$ $b$'s, corresponding to $k-1$ vertices of an
L-figure, are zero, we must choose the $k$th $b$ also to be zero. That is, $b$'s
corresponding to s-dots are also to be chosen zero. In the light of Lemma 7.1
we have


Lemma 7.2. Constants $b_{n_2 \cdots n_k}$ exist such that there is a formal series (7.2),
with $a = x^*$, satisfying (7.1), and such that

(i) $b_{0, \cdots, 0} \neq 0$, $b_{-1, 0, \cdots, 0} \neq 0$ in Case I, and $b_{0, \cdots, 0} \neq 0$ in Case II;

(ii) no formal pole of $Y(x; x^*)$ is at a distance not exceeding $p^*$ from the
origin, save $\tau_{0, \cdots, 0}$, $\tau_{-1, 0, \cdots, 0}$ in Case I and save $\tau_{0, \cdots, 0}$ in Case II.


We shall hereafter assume that the $b_{n_2 \cdots n_k}$'s have been chosen to satisfy
Lemma 7.2. Also, from the homogeneous character of equations (7.3), we
may suppose that $b_{0, \cdots, 0} = -1/a_1$. We dare not expect that series (7.2) is
convergent. But by the classic theorem of Mittag-Leffler, there exist polynomials
$P_{n_2 \cdots n_k}(x)$ such that the series


+ 


(7.6) (i 


N2,°** 


k 
(x*+01)— n ;(w1—@;) ] 
j=2 
converges uniformly and absolutely in every bounded region, the zeros of the
denominators deleted.† The operator $L$ may be applied term-wise to (7.6),
valid for every $x$ not a pole; and the only possible singularities of $L[Z]$ are
at the points $\tau_{n_2 \cdots n_k}$. Now on suitably grouping terms in $L[Z]$, as is permissible
because of absolute convergence, we get


te 
(i + + 01) 


k 
j=2 


j=? 


k 


where $A_{n_2 \cdots n_k}$ is the left-hand member of (7.3), and is accordingly zero.
Hence $L[Z]$ does not have any poles.‡ That is, $L[Z]$ is an entire function.
Now by Theorem 6.6 there is an entire function $g(x)$ satisfying

\[
L[g(x)] = L[Z(x; x^*)],
\]

so that the function $W(x; x^*) = Z(x; x^*) - g(x)$ satisfies

\[
L[W(x; x^*)] = 0. \tag{7.7}
\]

We therefore have

THEOREM 7.1.§ The homogeneous equation (7.7) has a meromorphic solution
$W(x; x^*)$ which has a pole at $x = x^* + \omega_1$, and which has no other pole whose
distance from the origin does not exceed $p^*$, except in Case I when there may be a
pole at $x = x^* + \omega_2$.

† $Z(x; x^*)$ has then no pole at a distance not exceeding $p^*$ from the origin save for $\tau_{0, \cdots, 0}$,
$\tau_{-1, 0, \cdots, 0}$ in Case I and save for $\tau_{0, \cdots, 0}$ in Case II.

‡ For if in the series for $Z$ (and therefore for $L[Z]$), any term is omitted, the resulting series
converges even at the pole corresponding to the omitted term.


§ In §4 the function (x; a) was already at hand. It has been the purpose of Lemma 7.1 to
enable us to assert the existence of a similar function in the present case.

In Theorem 6.4 we considered $a$ in a sector $\Sigma$ bounded by primary rays.
From the nature of primary rays it is readily seen that to every $j$ for which
the point $P_j$ is on $C$ there certainly corresponds a $\Sigma_j$-sector. Since point $P_1$
is on $C$, there is therefore a $\Sigma_1$-sector; and from a previous observation,
we may suppose with no loss of generality that $x^*$, and in fact the whole of
circle $C$, is in $\Sigma_1$.

Consider Case I. Let $p'$ be a number only slightly† larger than $p^*$, and
with $P_1$ and $P_2$ (i.e., $-\omega_1$ and $-\omega_2$) as centers, draw arcs of radius $p'$. They
will form a lens-contour, the two arcs of which we shall call $\Lambda_1$, $\Lambda_2$ ($\Lambda_j$ being
part of the circle with center at $P_j$). The function $y(x; a)$ of Theorem 6.4
(with $M = 1$ in the present case) has as its pole nearest the origin the point
$x = a + \omega_1$. For $a = x^*$, this pole is at a distance $p^*$ from $x = 0$, and no other
pole is within this distance of $x = 0$.

Now let $a$ trace the lens $\Lambda$ (consisting of the arcs $\Lambda_1$, $\Lambda_2$). Then $a + \omega_1$
describes a congruent lens $\Lambda'$, obtained from $\Lambda$ by a translation so that the
new center is at $x^* + \omega_1$ instead of at $x^*$. Let the corresponding arcs of the
new lens be $\Lambda_1'$, $\Lambda_2'$. One of these, namely $\Lambda_1'$, is an arc of a circle of center
$x = 0$ and radius $p'$, and consequently is wholly exterior to the circle of
radius $p^*$. That is, for all $a$ on $\Lambda_1$, $y(x; a)$ has no singularity at a distance
from $x = 0$ less than or equal to $p^*$. If we write

\[
y(x; a) = \sum_{n=0}^{\infty} c_n(a)\, x^n, \tag{7.8}
\]

this series converges uniformly in $a$ and $x$ for all $a$ on $\Lambda_1$ and all $x$ in
$|x| < p^* + \epsilon$ ($\epsilon > 0$ sufficiently small).

Now let $a$ be on $\Lambda_2$. If in $Y(x; x^*)$ we replace $x^*$ by $a$, the result is to give
us the function $W(x; a)$ obtained from $W(x; x^*)$ by replacing $x^*$ by $a$ in
$Z(x; x^*)$. Inasmuch as $x^*$ and $x$ enter $W(x; x^*)$ only in their difference $x - x^*$,
we have

\[
W(x; a) = W(x + x^* - a;\ x^*). \tag{7.9}
\]

Consider the function

\[
y_1(x; a) = y(x; a) + W(x; a). \tag{7.10}
\]

For $p'$ sufficiently near to $p^*$, and for $a$ on $\Lambda_2$, the only possible poles of
$y_1(x; a)$ whose distance from $x = 0$ does not exceed $p^*$ are at $x = a + \omega_1$,
$x = a + \omega_2$, since this is true separately of $y(x; a)$ and $W(x; a)$.

Consider the point $x = a + \omega_1$. At this point $y(x; a)$ has a simple pole with
multiplier $1/a_1$. But also $W(x; a)$ has a simple pole there, with a multiplier
$b_{0, \cdots, 0}$ which we chose equal to $-1/a_1$. Hence $y_1(x; a)$ remains analytic at
$x = a + \omega_1$.


† Just how close to $p^*$ it must be taken will soon appear.

Now let us examine $x = a + \omega_2$. The point $x^* + \omega_2$ is at a distance $p^*$ from
$x = 0$, and is diametrically opposite to $x^* + \omega_1$, from $x = 0$. Hence as $a$ traces
arc $\Lambda_2$, the point $a + \omega_2$ traces an arc of a circle of radius $p'$, center at $x = 0$.
Hence even if $a + \omega_2$ is a pole of $y_1(x; a)$, it lies at a distance greater than $p^*$
from $x = 0$. Then,

\[
y_1(x; a) = \sum_{n=0}^{\infty} d_n(a)\, x^n, \tag{7.11}
\]

uniformly convergent in $a$ and $x$ for all $a$ on $\Lambda_2$ and for all $x$ in $|x| < p^* + \epsilon$,
$\epsilon$ sufficiently small.
We have

\[
\limsup_{n \to \infty} |c_n(a)|^{1/n} < \frac{1}{p^*}, \qquad
\limsup_{n \to \infty} |d_n(a)|^{1/n} < \frac{1}{p^*}, \tag{7.12}
\]


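so that we can operate on (7.8) and (7.11) with $L$, getting†

\[
L[y(x; a)] = \sum_{n=0}^{\infty} c_n(a) A_n(x), \qquad
L[y_1(x; a)] = \sum_{n=0}^{\infty} d_n(a) A_n(x), \tag{7.13}
\]

the series being uniformly convergent for all $a$, $x$ such that $a$ is on $\Lambda_1$, $\Lambda_2$ respectively,
and $x$ is in a closed region containing $x^*$ and of diameter sufficiently
small. But $L[W] = 0$, so that $L[y] = L[y_1] = 1/(x - a)$. Hence

\[
\frac{1}{x - a} = \sum_{n=0}^{\infty} c_n(a) A_n(x), \qquad
\frac{1}{x - a} = \sum_{n=0}^{\infty} d_n(a) A_n(x), \tag{7.14}
\]

uniformly convergent for all $a$ and $x$ such that $a$ is on $\Lambda_1$, $\Lambda_2$ respectively,
and $x$ is in a small enough neighborhood of $x = x^*$.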

Now we take up Case II. $p'$ being a slightly larger number than $p^*$, we
strike three arcs $\Lambda_1$, $\Lambda_2$, $\Lambda_3$ of radius $p'$ and centers $-\omega_1$, $-\omega_2$, $-\omega_3$, forming
a curvilinear triangle. When $a$ traces $\Lambda_1$, the point $a + \omega_1$ traces a congruent
arc with center at $x = 0$, so that the pole $a + \omega_1$ (which is the nearest pole to
the origin) of $y(x; a)$ is at a distance from $x = 0$ exceeding $p^*$. Hence (7.8)
continues to hold, $a$ remaining on $\Lambda_1$. For $a$ on either $\Lambda_2$ or $\Lambda_3$, the function
$y_1(x; a)$ of (7.10) has no singularity at $x = a + \omega_1$ (as in Case I). But (cf. Theorem
7.1) $y_1(x; a)$ can have no other singularity within a distance $p^*$ of $x = 0$.


† This follows on using Theorem 6.2.

Hence there are no singularities within this distance, and (7.11) holds uniformly
for $a$ on $\Lambda_2 + \Lambda_3$. From this follows the continued validity of (7.13)
and (7.14), for $a$ on $\Lambda_1$ and on $\Lambda_2$, $\Lambda_3$ respectively. That is, we have

THEOREM 7.2. The function $1/(x - a)$ has the $A_n$-expansions of (7.14), uniformly
convergent, respectively, for all $a$ on $\Lambda_1$ and all $a$ on $\Lambda_2$ in Case I (and all
$a$ on $\Lambda_2$ and $\Lambda_3$ in Case II), and for all $x$ in a sufficiently small neighborhood
of $x = x^*$.

Now let $F(x)$ be analytic about $x = x^*$. Then there exists (according to
Case I or Case II) a lens or circular triangle around $x^*$ lying with its boundary
wholly in the region of analyticity of $F(x)$. On applying the Cauchy formula
to series (7.14), term-by-term integration being of course allowable, we
obtain

\[
F(x) = \sum_{n=0}^{\infty} f_n A_n(x), \tag{7.15}
\]

where


1 1 
(7.16) fa = J cx(a)F (a)da 


Tt TUS Ay or 


the series being uniformly convergent for x sufficiently close to x*. Hence 


THEOREM 7.3. If $F(x)$ is analytic about $x = x^*$, it has a convergent $A_n$-expansion,
given by (7.15).


Combining this with Theorem 6.1:


THEOREM 7.4. A necessary and sufficient condition that a function $F(x)$ have
a (convergent) $A_n$-expansion is that it be analytic at $x = x^*$.


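By Theorem 6.2, $\limsup |f_n|^{1/n} < 1/p^*$, so that the series

\[
y(x) = \sum_{n=0}^{\infty} f_n x^n \tag{7.17}
\]

converges in $|x| < p^* + \epsilon$, for some $\epsilon > 0$. On applying $L$ to (7.17) we get

\[
L[y(x)] = \sum_{n=0}^{\infty} f_n L[x^n] = \sum_{n=0}^{\infty} f_n A_n(x) = F(x),
\]

so that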


THEOREM 7.5. If F(x) is analytic about x=x*, then the function y(x) of 
(7.17) is analytic in a circle about x =0 of radius greater than p*, and for all x 
in this circle y(x) satisfies the equation 


(6.3) L[y(x)] = F(x).

The point $x^*$ is of course significant for $A_n$-expansions, but not for equation
(6.3). For let $F(x)$ be analytic about $x = c$, and define $G(x) = F(x + c - x^*)$.
$G(x)$ is analytic about $x = x^*$ and therefore there exists a function $z(x)$ such
that $L[z(x)] = G(x)$. Consequently, the function $y(x) = z(x - c + x^*)$ satisfies
$L[y(x)] = F(x)$, and we have the final

THEOREM 7.6. If $F(x)$ is analytic about $x = c$, there exists a function $y(x)$,
analytic about $x = c - x^*$ in a circle of radius exceeding $p^*$, such that for all $x$
in this circle $y(x)$ satisfies equation (6.3).


ON PRIME NUMBERS OF REAL QUADRATIC
FIELDS IN RECTANGLES*
BY
HANS RADEMACHER

In a recently published paper† I investigated the number $P(x, x')$ of
primes $\omega$ in a real quadratic field satisfying the inequalities $0 < \omega < x$,
$0 < \omega' < x'$. The method used there depends on certain refined estimates of
the “angular” distribution of prime numbers in real quadratic fields.‡ In
the present paper I propose to give another more direct proof for the estimate
of $P(x, x')$, which starts from the obvious remark that for a totally positive
unit $\eta$ of the field we have

\[
P(x, x') = P(x\eta,\ x'\eta'). \tag{1}
\]

This fact leads to the periodicity of the function $P(x\eta^v, x'\eta'^v)$ with respect to
the variable $v$, and subsequently to a Fourier development of $P(x\eta^v, x'\eta'^v)$ as
function of $v$.§

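Our function $P(x, x')$ is a special case of a more general type, which may
be described as follows: Let $f(\mu, \mu')$ be a function defined for all integers $\mu$
of the real quadratic field $\mathfrak{K}$ ($\mu'$ being the conjugate of $\mu$) having the property

\[
f(\mu, \mu') = f(\mu\eta,\ \mu'\eta') \tag{2}
\]

for all totally positive units $\eta$ of the field. Then we have

\[
F(x, x') = \sum_{\substack{0 < \mu < x \\ 0 < \mu' < x'}} f(\mu, \mu')
= \sum_{\substack{0 < \mu\eta < x\eta \\ 0 < \mu'\eta' < x'\eta'}} f(\mu, \mu')
= \sum_{\substack{0 < \mu\eta < x\eta \\ 0 < \mu'\eta' < x'\eta'}} f(\mu\eta, \mu'\eta')
= \sum_{\substack{0 < \nu < x\eta \\ 0 < \nu' < x'\eta'}} f(\nu, \nu') = F(x\eta,\ x'\eta'),
\]
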


analogous to (1). But $F(x, x')$ is of course discontinuous and hence would
not furnish an absolutely convergent Fourier series, which however we need
for the application to the estimate of the number of primes in certain rectangles
(§2). We therefore prefer to investigate

\[
F_1(x, x') = \sum_{0 \prec \mu \prec x} (x - \mu)(x' - \mu')\, f(\mu, \mu')
\]

(the notation $0 \prec \mu \prec x$ being an abbreviation of both the inequalities $0 < \mu < x$
and $0 < \mu' < x'$ together), which yields an absolutely convergent Fourier series.


* Presented to the Society, October 26, 1935; received by the editors October 4, 1935.

† Über die Anzahl der Primzahlen eines reell-quadratischen Zahlkörpers, deren Konjugierte unterhalb
gegebener Grenzen liegen, Acta Arithmetica, vol. 1 (1935), pp. 67-77, subsequently cited as “K.”

‡ Primzahlen reell-quadratischer Zahlkörper in Winkelräumen, Mathematische Annalen, vol. 111
(1935), pp. 209-228, subsequently referred to as “W.”

§ My attention was drawn to this use of the Fourier development by a remark of Siegel, who,
as I have heard from him, some years ago found an identity similar to formula (13). In my former
method the Fourier expansion was used at another step of the proof, viz., in connection with the
angular distribution of the primes.

In §2 we specialize $f(\mu, \mu')$ and hence $F_1(x, x')$ for our prime-number
problem and then have to make use of some results given in “W” concerning
Hecke's $\zeta(s, \lambda)$-functions. In §3 we are in a position to go back from $F_1(x, x')$
to $F(x, x')$ in this special case. Our principal results are the formulas (13),
(33), (43).

For the sake of simplicity we content ourselves with these formulas. Of
course no fundamental changes would occur if we introduce an ideal modulus $\mathfrak{a}$
and a fixed algebraic integer $\kappa$ and then sum only over the integers $\mu \equiv \kappa$
(mod $\mathfrak{a}$). Instead of the totally positive fundamental unit $\eta$ we should have to
use the totally positive fundamental unit $\eta_{\mathfrak{a}}$ mod $\mathfrak{a}$, i.e., with $\eta_{\mathfrak{a}} \equiv 1$ (mod $\mathfrak{a}$).
But the more general result having been already given as a theorem of “K,”
page 76, and our main interest being at present the exhibition of the other
method, we confine ourselves to the case $\mathfrak{a} = (1)$.


1. A FOURIER EXPANSION


Let $\eta > 0$ be the totally positive fundamental unit, and especially $\eta > 1$;
we then have
$$\eta\eta' = 1, \qquad \eta' > 0.$$
Let $f(\mu, \mu')$ have the property (2). Then we build up the function
(3) $$F_1(x, x') = \sum_{0 \prec \mu \prec x} (x - \mu)(x' - \mu') f(\mu, \mu').$$


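Now we have
$$\begin{aligned}
F_1(x\eta, x'\eta') &= \sum_{0 \prec \mu \prec x\eta} (x\eta - \mu)(x'\eta' - \mu') f(\mu, \mu') \\
&= \sum_{0 \prec \nu \prec x} (x\eta - \nu\eta)(x'\eta' - \nu'\eta') f(\nu\eta, \nu'\eta') \\
&= \sum_{0 \prec \nu \prec x} (x - \nu)(x' - \nu') f(\nu, \nu') = F_1(x, x').
\end{aligned}$$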
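The invariance just derived can be checked numerically. The following is only a minimal sketch under stated assumptions, not part of the original argument: it takes the field $Q(\sqrt 2)$ with integers $a + b\sqrt 2$ and totally positive fundamental unit $\eta = 3 + 2\sqrt 2$, and the simplest admissible choice $f \equiv 1$. The names `F1`, `f_const`, `ETA` are illustrative.

```python
import math

SQRT2 = math.sqrt(2.0)
ETA = 3.0 + 2.0 * SQRT2        # assumed example: totally positive unit of Q(sqrt 2), eta > 1
ETA_CONJ = 3.0 - 2.0 * SQRT2   # its conjugate eta', so that eta * eta' = 1


def f_const(a, b):
    """Simplest function with property (2): f is identically 1."""
    return 1.0


def F1(x, x_conj, f=f_const):
    """Sum of (x - mu)(x' - mu') f(mu, mu') over the integers mu = a + b*sqrt(2)
    with 0 < mu < x and 0 < mu' < x', where mu' = a - b*sqrt(2)."""
    total = 0.0
    # mu + mu' = 2a, hence 0 < 2a < x + x'.
    a_max = int((x + x_conj) / 2.0) + 1
    for a in range(1, a_max + 1):
        # mu > 0 and mu' > 0 together force |b| * sqrt(2) < a.
        b_max = int(a / SQRT2) + 1
        for b in range(-b_max, b_max + 1):
            mu = a + b * SQRT2
            mu_conj = a - b * SQRT2
            if 0.0 < mu < x and 0.0 < mu_conj < x_conj:
                total += (x - mu) * (x_conj - mu_conj) * f(a, b)
    return total


x, x_conj = 5.0, 7.0
# The two printed values should agree up to rounding error.
print(F1(x, x_conj), F1(x * ETA, x_conj * ETA_CONJ))
```

The smoothing factor $(x-\mu)(x'-\mu')$ vanishes on the boundary of the rectangle, so lattice points lying on or very near the boundary do not disturb the floating-point comparison.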


Therefore $F_1(x\eta^v, x'\eta^{-v})$ as function of $v$ is periodic with the period 1. It has
obviously a bounded derivative and therefore can be developed into an absolutely
convergent Fourier series, which we write down immediately for the
special value $v = 0$:


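$$\begin{aligned}
F_1(x, x') &= \sum_{n=-\infty}^{+\infty} \int_0^1 e^{-2\pi i n v} F_1(x\eta^v, x'\eta^{-v})\,dv \\
&= \sum_{n=-\infty}^{+\infty} \int_0^1 e^{-2\pi i n v} \sum_{\substack{0<\mu<x\eta^v \\ 0<\mu'<x'\eta^{-v}}} (x\eta^v - \mu)(x'\eta^{-v} - \mu') f(\mu, \mu')\,dv.
\end{aligned}$$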

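If we collect associated numbers, i.e., numbers differing only by factors that
are powers of $\eta$,* we get
$$F_1(x, x') = \sum_{n=-\infty}^{+\infty} \int_0^1 e^{-2\pi i n v} \sum_{\substack{(\mu)_1 \\ N(\mu)<xx'}} f(\mu, \mu') \sum_{\substack{k \\ 0<\mu\eta^k<x\eta^v \\ 0<\mu'\eta^{-k}<x'\eta^{-v}}} (x\eta^v - \mu\eta^k)(x'\eta^{-v} - \mu'\eta^{-k})\,dv,$$
where the notation $(\mu)_1$ indicates that only one representative $\mu$ is taken out
of each set of associated numbers (in the narrowest sense). The finite summation
over $(\mu)_1$ may at once be interchanged with the integration, and in the
exponential function we can replace $v$ by $v-k$:
$$F_1(x, x') = \sum_{n=-\infty}^{+\infty} \sum_{\substack{(\mu)_1 \\ N(\mu)<xx'}} f(\mu, \mu') \int_0^1 \sum_{\substack{k \\ 0<\mu\eta^k<x\eta^v \\ 0<\mu'\eta^{-k}<x'\eta^{-v}}} e^{-2\pi i n (v-k)} (x\eta^{v-k} - \mu)(x'\eta^{-(v-k)} - \mu')\,dv.$$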

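If we now interchange also the integration and the summation with respect to
$k$, we observe that $v$ is not only bounded by $0<v<1$, but for each $k$ also by
$\mu\eta^k < x\eta^v$ and by $\mu'\eta^{-k} < x'\eta^{-v}$, which we can express as follows:
$$F_1(x, x') = \sum_{n=-\infty}^{+\infty} \sum_{\substack{(\mu)_1 \\ N(\mu)<xx'}} f(\mu, \mu') \sum_{k} \int e^{-2\pi i n (v-k)} (x\eta^{v-k} - \mu)(x'\eta^{-(v-k)} - \mu')\,dv,$$
the range of integration in $v$ for each $k$ being determined by the two conditions
$$0 < v < 1, \qquad \frac{\log(\mu/x)}{\log\eta} + k < v < \frac{\log(x'/\mu')}{\log\eta} + k;$$
only those values of $k$ are admitted for which the second condition yields a $v$
between 0 and 1. We substitute $w$ for $v-k$: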
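* More precisely these numbers should be called “associated in the narrowest sense,” as only
totally positive units and not all units are admitted as factors. We shall have to recall this
distinction later, on p. 387.

$$F_1(x, x') = \sum_{n=-\infty}^{+\infty} \sum_{\substack{(\mu)_1 \\ N(\mu)<xx'}} f(\mu, \mu') \sum_{k} \int e^{-2\pi i n w} (x\eta^{w} - \mu)(x'\eta^{-w} - \mu')\,dw$$
with the conditions of integration
$$-k < w < 1-k, \qquad \frac{\log(\mu/x)}{\log\eta} < w < \frac{\log(x'/\mu')}{\log\eta}.$$
But $k$ running through all appropriate integers, the integrals for successive $k$
unite to one integral, the boundaries of which are given by the second condition
alone:
(4) $$\begin{aligned}
F_1(x, x') &= \sum_{n=-\infty}^{+\infty} \sum_{\substack{(\mu)_1 \\ N(\mu)<xx'}} f(\mu, \mu') \int_{\log(\mu/x)/\log\eta}^{\log(x'/\mu')/\log\eta} e^{-2\pi i n w} (x\eta^{w} - \mu)(x'\eta^{-w} - \mu')\,dw \\
&= \sum_{n=-\infty}^{+\infty} \sum_{\substack{(\mu)_1 \\ N(\mu)<xx'}} f(\mu, \mu')\, I_n(\mu, \mu'),
\end{aligned}$$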

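say. We have now to evaluate the integrals $I_n(\mu, \mu')$.
First, for $n \neq 0$, we get
$$I_n(\mu, \mu') = \int_{\log(\mu/x)/\log\eta}^{\log(x'/\mu')/\log\eta} \left\{ (xx' + \mu\mu') e^{-2\pi i n w} - x\mu' e^{-2\pi i n w + w\log\eta} - x'\mu\, e^{-2\pi i n w - w\log\eta} \right\} dw$$
and finally, on carrying out the integration and collecting terms,
(5) $$\begin{aligned}
I_n(\mu, \mu') = \frac{\log\eta}{2\pi i n} \left(\frac{x}{x'}\right)^{\pi i n/\log\eta} \left(\frac{\mu}{\mu'}\right)^{-\pi i n/\log\eta} \Bigg\{ &\frac{xx'}{2\pi i n - \log\eta} \left[ \left(\frac{N(\mu)}{xx'}\right)^{\pi i n/\log\eta} - \left(\frac{N(\mu)}{xx'}\right)^{1 - \pi i n/\log\eta} \right] \\
+ &\frac{xx'}{2\pi i n + \log\eta} \left[ \left(\frac{N(\mu)}{xx'}\right)^{-\pi i n/\log\eta} - \left(\frac{N(\mu)}{xx'}\right)^{1 + \pi i n/\log\eta} \right] \Bigg\}.
\end{aligned}$$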
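For $n = 0$ we get by an easy calculation
(6) $$\begin{aligned}
I_0(\mu, \mu') &= \int_{\log(\mu/x)/\log\eta}^{\log(x'/\mu')/\log\eta} \left\{ (xx' + \mu\mu') - x\mu' e^{w\log\eta} - x'\mu\, e^{-w\log\eta} \right\} dw \\
&= \frac{1}{\log\eta} \left\{ (xx' + N(\mu)) \log\frac{xx'}{N(\mu)} - 2(xx' - N(\mu)) \right\}.
\end{aligned}$$

Before introducing (5) and (6) into (4) we make use of the following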

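LEMMA. Let $0 < y$, $y^s = e^{s \log y}$ with the principal value of $\log y$, and $\alpha$, $\beta$ complex
numbers, $c$ real with $c > \max(-\Re(\alpha), -\Re(\beta))$. Then
(7) $$\frac{1}{2\pi i} \int_{c - i\infty}^{c + i\infty} \frac{y^s}{(s + \alpha)(s + \beta)}\,ds = \begin{cases} 0, & 0 < y \le 1, \\[4pt] \dfrac{y^{-\alpha} - y^{-\beta}}{\beta - \alpha}, & 1 \le y. \end{cases}$$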


To prove the lemma we consider first an integral extended from $c - i\Omega$
to $c + i\Omega$ with large positive $\Omega$. This path of integration parallel to the imaginary
axis is then to be replaced by a half-circle with radius $\Omega$, center $c$; for
$0 < y < 1$ we take the half-circle to the right-hand side (side of the positive
real part in the $s$-plane), for $1 < y$ we take the half-circle to the left-hand side.
An easy estimate shows that on both half-circles the integral tends to zero
with infinitely increasing radius $\Omega$. In the first case no pole is enclosed between
the new and the old path of integration, in the second the poles $-\alpha$
and $-\beta$. The calculus of residues then yields the result.
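As an illustration only, the statement of the lemma can be compared with a direct numerical evaluation of the integral along a truncated vertical line. This is a rough sketch, not a proof: the truncation height `T`, the step `h`, and the sample values of $\alpha$, $\beta$, $c$, $y$ are arbitrary assumptions, and the function names are hypothetical.

```python
import cmath
import math


def vertical_line_integral(y, alpha, beta, c, T=500.0, h=0.02):
    """Trapezoidal approximation of (1/(2*pi*i)) * integral of
    y**s / ((s + alpha)(s + beta)) ds along s = c + i*t, -T <= t <= T."""
    log_y = math.log(y)
    steps = int(round(2.0 * T / h))
    total = 0.0 + 0.0j
    for k in range(steps + 1):
        t = -T + k * h
        s = complex(c, t)
        value = cmath.exp(s * log_y) / ((s + alpha) * (s + beta))
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * value
    # ds = i dt, so the factor 1/(2*pi*i) becomes 1/(2*pi).
    return total * h / (2.0 * math.pi)


def closed_form(y, alpha, beta):
    """Right-hand side of the lemma: zero for y <= 1, residue sum for y >= 1."""
    if y <= 1.0:
        return 0.0
    return (y ** (-alpha) - y ** (-beta)) / (beta - alpha)


for y in (0.5, 2.0):
    print(y, vertical_line_integral(y, 0.5, 1.5, 1.0), closed_form(y, 0.5, 1.5))
```

Since the integrand decays only like $1/t^2$, the truncated integral agrees with the closed form to a few decimal places at best; the comparison is meant to make the case distinction $y \le 1$, $y \ge 1$ visible, nothing more.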

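An application of (7) to (5) gives for $0 < N(\mu)/(xx') < 1$
(8) $$\begin{aligned}
I_n(\mu, \mu') = \frac{xx'}{2 i\gamma n \log\eta} \left(\frac{x}{x'}\right)^{i\gamma n} \left(\frac{\mu}{\mu'}\right)^{-i\gamma n} \Bigg\{ &\frac{1}{2\pi i} \int_{c - i\infty}^{c + i\infty} \left(\frac{xx'}{N(\mu)}\right)^{s} \frac{ds}{(s - i\gamma n)(s + 1 + i\gamma n)} \\
- &\frac{1}{2\pi i} \int_{c - i\infty}^{c + i\infty} \left(\frac{xx'}{N(\mu)}\right)^{s} \frac{ds}{(s + i\gamma n)(s + 1 - i\gamma n)} \Bigg\}
\end{aligned}$$
with $c > 0$ and the abbreviation
(9) $$\gamma = \frac{\pi}{\log\eta}.$$
The two integrals can be contracted into one:
(10) $$I_n(\mu, \mu') = \frac{xx'}{\log\eta} \left(\frac{x}{x'}\right)^{i\gamma n} \left(\frac{\mu}{\mu'}\right)^{-i\gamma n} \frac{1}{2\pi i} \int_{c - i\infty}^{c + i\infty} \left(\frac{xx'}{N(\mu)}\right)^{s} \frac{ds}{(s + i\gamma n)(s - i\gamma n)(s + 1 + i\gamma n)(s + 1 - i\gamma n)}.$$
The definition (4) shows clearly that $I_n(\mu, \mu')$ depends on $n$ continuously.
Hence (10) is valid for $n = 0$ also. This could of course be verified by direct
reference to (6).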

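By our lemma the integrals in (8), and therefore also the integrals in (10),
are equal to zero for $1 \le N(\mu)/(xx')$ or $N(\mu) \ge xx'$. Hence it is unnecessary
after the introduction of (10) into (4) to restrict the summation to the range
$N(\mu) \le xx'$, and thus we get
(11) $$F_1(x, x') = \frac{xx'}{\log\eta} \sum_{n=-\infty}^{+\infty} \left(\frac{x}{x'}\right)^{i\gamma n} \sum_{(\mu)_1} f(\mu, \mu') \left(\frac{\mu}{\mu'}\right)^{-i\gamma n} \frac{1}{2\pi i} \int_{c - i\infty}^{c + i\infty} \left(\frac{xx'}{N(\mu)}\right)^{s} \frac{ds}{(s + i\gamma n)(s - i\gamma n)(s + 1 + i\gamma n)(s + 1 - i\gamma n)}.$$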

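Up to this point the function $f(\mu, \mu')$ has been subject to no other conditions
than (2). Let us now assume
$$f(\mu, \mu') = O(|N(\mu)|^{a})$$
with a certain real $a$. If we then introduce
(12) $$Z_n(s) = \sum_{(\mu)_1} \frac{f(\mu, \mu')}{N(\mu)^{s}} \left(\frac{\mu}{\mu'}\right)^{-i\gamma n},$$
the series is absolutely convergent for $\Re(s) > a+1$. In this half-plane the function
$Z_n(s)$ is certainly regular. The number $c > 0$, determining the path of integration
in (10), can be chosen as greater than $a+1$. Then the interchange
of the summation over $\mu$ and the integration is justified. From (11) and (12)
we get
(13) $$F_1(x, x') = \sum_{0 \prec \mu \prec x} (x - \mu)(x' - \mu') f(\mu, \mu') = \frac{xx'}{\log\eta} \sum_{n=-\infty}^{+\infty} \left(\frac{x}{x'}\right)^{i\gamma n} \frac{1}{2\pi i} \int_{c - i\infty}^{c + i\infty} \frac{(xx')^{s} Z_n(s)}{(s + i\gamma n)(s - i\gamma n)(s + 1 + i\gamma n)(s + 1 - i\gamma n)}\,ds.$$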


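If we treat the term $n = 0$ separately by using its original form (6), we have
(13a) $$\begin{aligned}
F_1(x, x') = {} &\frac{1}{\log\eta} \sum_{\substack{(\mu)_1 \\ N(\mu) < xx'}} f(\mu, \mu') \left\{ (xx' + N(\mu)) \log\frac{xx'}{N(\mu)} - 2(xx' - N(\mu)) \right\} \\
&+ \frac{xx'}{\log\eta} \sum_{n=-\infty}^{+\infty}{}^{\prime} \left(\frac{x}{x'}\right)^{i\gamma n} \frac{1}{2\pi i} \int_{c - i\infty}^{c + i\infty} \frac{(xx')^{s} Z_n(s)}{(s + i\gamma n)(s - i\gamma n)(s + 1 + i\gamma n)(s + 1 - i\gamma n)}\,ds,
\end{aligned}$$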

the prime at the summation sign meaning the omission of $n = 0$. In formulas
(13) and (13a) we have attained the objects of this paragraph. It may here
be added that by quite analogous reasoning and calculation we can derive
the formula
(14) $$F(x, x') = \sum{}^{*} f(\mu, \mu') = \frac{1}{\log\eta} \sum_{n=-\infty}^{+\infty} \left(\frac{x}{x'}\right)^{i\gamma n} \frac{1}{2\pi i} \int_{c - i\infty}^{c + i\infty} \frac{(xx')^{s} Z_n(s)}{(s + i\gamma n)(s - i\gamma n)}\,ds,$$
where $\sum^{*}$ indicates a special treatment of the boundary summands: a term
$f(\mu, \mu')$ with $\mu = x$, $\mu' < x'$ or with $\mu < x$, $\mu' = x'$ is only to be taken into account
as $\tfrac{1}{2} f(\mu, \mu')$, whereas a term with $\mu = x$, $\mu' = x'$ does not count at all. But the
series in (14) with summation over $n$ is not absolutely convergent and therefore
not suited to our further applications.

I am in the possession of a more general formula, of which (13) and (14)
are special cases and which I hope to communicate on another occasion.


2. ESTIMATE OF PRIME NUMBERS IN RECTANGLES 


We shall now specialize our formula (13) for treating our prime-number
problem. Let us put
(15) $$f(\mu, \mu') = \begin{cases} 1 & \text{for } (\mu) \text{ prime ideal}, \\ 0 & \text{otherwise}. \end{cases}$$
For this special $f(\mu, \mu')$ the function $F_1(x, x')$ may be called $P_1(x, x')$. In this
case we have
(16) $$Z_n(s) = \sum_{(\omega)_1} \frac{1}{N(\omega)^{s}} \left(\frac{\omega}{\omega'}\right)^{-i\gamma n},$$
$\omega$ running only through prime numbers, i.e., integers, whose principal ideals
$(\omega)$ are prime ideals. We are in a position to discuss the function by reducing
it to Hecke’s well known $\zeta(s, \lambda)$-functions.

For this purpose we introduce following Hecke† “ideal numbers” $\hat\mu$, which
together with the numbers of the algebraic field $k$ constitute a certain larger
realm in which multiplication, division, and the operation of determining
the greatest common divisor are possible without exception, and in which moreover
all units belong to the given field $k$. These numbers can be separated
into $2^2 h$ classes under the stipulation that two belong to the same class when
and only when their quotient is a totally positive (integral or fractional)
algebraic number of the field. The number $h$ is the ordinary class number.

Let $\chi(\hat\mu)$ be a character of the Abelian class group of order $2^2 h$; the unit
element of this class group being the class of all totally positive algebraic
numbers of the field $k$, we have $\chi(\mu) = 1$ for $\mu \succ 0$, and especially $\chi(\eta) = 1$ for
all characters $\chi$. Hence we have


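$$Z_n(s) = \frac{1}{4h} \sum_{\chi} \sum_{(\hat\omega)_1} \frac{\chi(\hat\omega)\lambda^{n}(\hat\omega)}{|N(\hat\omega)|^{s}},$$
where we use Hecke’s notation
$$\lambda(\hat\mu) = \left|\frac{\hat\mu}{\hat\mu'}\right|^{-\pi i/\log\eta}$$
and $(\hat\omega)_1$ means again that out of each set of ideal numbers associated in the
narrow sense we have only to take one representative. But two numbers not
associated in the narrow sense may well be associated in the ordinary sense.
If we select only non-associated numbers in the ordinary sense, we must consider
units $\theta$ which are not narrowly associated:
(17) $$Z_n(s) = \frac{1}{4h} \sum_{\chi} \sum_{(\hat\omega)} \frac{\chi(\hat\omega)\lambda^{n}(\hat\omega)}{|N(\hat\omega)|^{s}} \sum_{(\theta)} \chi(\theta)\lambda^{n}(\theta).$$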


† E. Hecke, Eine neue Art von Zetafunktionen und ihre Beziehungen zur Verteilung der Primzahlen,
II, Mathematische Zeitschrift, vol. 6 (1920), pp. 11-51.

The units not associated in the narrow sense form a group of order $2q$, where $q$
is defined by
(18) $$\eta = \varepsilon^{q}$$
with $\varepsilon > 1$ the ordinary fundamental unit. (For $q = 1$ we have only $+1$ and $-1$
not associated in the narrow sense; for $q = 2$ the units $+1$, $-1$, $+\varepsilon$, $-\varepsilon$.)
Obviously $\chi(\theta)\lambda^{n}(\theta)$ is a character of this group of units.

Now we have to distinguish two cases:

(i) $\chi(\theta)\lambda^{n}(\theta) = 1$ for all units $\theta$. This occurs for certain $\chi$, which have the
property $\chi(\theta) = \lambda^{-n}(\theta)$ for the finite group of not narrowly associated units. Such
$\chi$ always exist. Indeed, for given $\lambda^{n}$ they are determined at first only for the
subgroup of such classes which contain units. But according to a general
property of characters of Abelian groups it is always possible to extend a
character given on a subgroup to the total enclosing group. The number of
such characters is equal to the index of the subgroup in the total group; in
our case therefore $4h/(2q)$. For such $\chi$ we have
(19) $$\sum_{(\theta)} \chi(\theta)\lambda^{n}(\theta) = 2q = \frac{2\log\eta}{\log\varepsilon}$$
because of (18). In this case we call (in a slight modification of Hecke’s
terminology) the product $\chi(\hat\mu)\lambda^{n}(\hat\mu)$ an “angular character for ideals” and use
the abbreviated symbol $\chi\lambda^{n}(\hat\mu)$. In fact because $\chi\lambda^{n}(\theta) = 1$ the character
$\chi\lambda^{n}(\hat\mu)$ has the same value for all $\hat\mu$ representing the same ideal.

(ii) Not for all units $\theta$ is $\chi(\theta)\lambda^{n}(\theta) = 1$. Then $\chi(\theta)\lambda^{n}(\theta)$ is not the principal
character for the subgroup of the classes containing units, hence we have
$$\sum_{(\theta)} \chi(\theta)\lambda^{n}(\theta) = 0.$$

In (17) therefore we have only to consider such $\chi$ as give rise to an angular
character $\chi\lambda^{n}(\hat\mu)$ for ideals (such $\chi$ exist, as mentioned above, in number
$2h/q$). If in a summation over $\chi$ we have to select such $\chi$ in the manner mentioned
with regard to $\lambda^{n}$ we will mark it by a subscript $\lambda^{n}$ attached to the summation
sign. From (17) and (19) we have


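(20) $$Z_n(s) = \frac{\log\eta}{2h\log\varepsilon} \sum_{\chi,\,\lambda^{n}} \sum_{(\hat\omega)} \frac{\chi\lambda^{n}(\hat\omega)}{|N(\hat\omega)|^{s}} = \frac{\log\eta}{2h\log\varepsilon} \sum_{\chi,\,\lambda^{n}} Z(s, \chi\lambda^{n}),$$
say.

On the other hand we have the definition of Hecke’s $\zeta(s, \lambda)$-functions:
$$\zeta(s, \chi\lambda^{n}) = \sum_{(\hat\mu)} \frac{\chi\lambda^{n}(\hat\mu)}{|N(\hat\mu)|^{s}} = \prod_{(\hat\omega)} \left(1 - \frac{\chi\lambda^{n}(\hat\omega)}{|N(\hat\omega)|^{s}}\right)^{-1},$$
valid for $\Re(s) > 1$, and hence
$$\begin{aligned}
\log\zeta(s, \chi\lambda^{n}) &= \sum_{(\hat\omega)} \log\frac{1}{1 - \chi\lambda^{n}(\hat\omega)\,|N(\hat\omega)|^{-s}} \\
&= \sum_{(\hat\omega)} \frac{\chi\lambda^{n}(\hat\omega)}{|N(\hat\omega)|^{s}} + \sum_{(\hat\omega)} \sum_{m=2}^{\infty} \frac{(\chi\lambda^{n}(\hat\omega))^{m}}{m\,|N(\hat\omega)|^{ms}} \\
&= Z(s, \chi\lambda^{n}) + \Xi(s, \chi\lambda^{n}),
\end{aligned}$$
say.
Now $\Xi(s, \chi\lambda^{n})$ is regular for $\Re(s) > \tfrac{1}{2}$, since the defining series is absolutely
and uniformly convergent for $\Re(s) \ge \sigma_0 > \tfrac{1}{2}$: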
1 
@) m|N(a)|™ 2 | |% 1 
| |" 
1 1 1 1 1 
{= < 
2 2 | 1— 2 — | N(p) | 20 


1 


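which shows moreover that for $\Re(s) \ge \sigma_0 > \tfrac{1}{2}$ the function $\Xi(s, \chi\lambda^{n})$ is bounded.
We can write
(21) $$Z_n(s) = \frac{\log\eta}{2h\log\varepsilon} \sum_{\chi,\,\lambda^{n}} \log\zeta(s, \chi\lambda^{n}) + B_n(s),$$
where $B_n(s)$ is bounded for $\Re(s) \ge \sigma_0 > \tfrac{1}{2}$ and all $n$. If we insert (21) and (15)
in (13a), we get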
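(22) $$\begin{aligned}
P_1(x, x') = {} &\sum_{0 \prec \omega \prec x} (x - \omega)(x' - \omega') \\
= {} &H + \frac{xx'}{2h\log\varepsilon} \sum_{n=-\infty}^{+\infty}{}^{\prime} \left(\frac{x}{x'}\right)^{i\gamma n} \sum_{\chi,\,\lambda^{n}} \frac{1}{2\pi i} \int_{2 - i\infty}^{2 + i\infty} \frac{(xx')^{s} \log\zeta(s, \chi\lambda^{n})}{(s + i\gamma n)(s - i\gamma n)(s + 1 + i\gamma n)(s + 1 - i\gamma n)}\,ds \\
&+ \frac{xx'}{\log\eta} \sum_{n=-\infty}^{+\infty}{}^{\prime} \left(\frac{x}{x'}\right)^{i\gamma n} \frac{1}{2\pi i} \int_{2 - i\infty}^{2 + i\infty} \frac{(xx')^{s} B_n(s)}{(s + i\gamma n)(s - i\gamma n)(s + 1 + i\gamma n)(s + 1 - i\gamma n)}\,ds
\end{aligned}$$
with
(22a) $$H = \frac{1}{\log\eta} \sum_{\substack{(\omega)_1 \\ N(\omega) < xx'}} \left\{ (xx' + N(\omega)) \log\frac{xx'}{N(\omega)} - 2(xx' - N(\omega)) \right\}.$$
The two infinite sums over $n \neq 0$ in (22) are now to be estimated. We begin
with the second of these sums, which is easier to handle. In the following
the letter $C$ is used for positive constants, not necessarily always the same.

The functions $B_n(s)$ being regular and bounded for $\Re(s) \ge \sigma_0 > \tfrac{1}{2}$, we can
shift the path of integration to the left up to the abscissa $\sigma = \tfrac{3}{4}$. Then we have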

f 3/4+i2 (xx’)*B,(s) 
(SH 


+o dt 
<C 


By reason of symmetry the latter integral from $-\infty$ to $+\infty$ can be replaced
by twice the integral extended from 0 to $+\infty$; without loss of generality we
can further suppose $n > 0$:


dt 
4 


sx’ t+ ( x 1 (xx’)*B,(s) 4 
— 

< C(xx’)7/4 > 


n=1 


1 
— = C(xx’)™4, 
n2 
For the estimate of the first infinite sum over $n$ in (22) we shall make use of
the following

LEMMA. There exists an absolute constant $E \ge 1$ such that in the region
(boundaries included) of the plane of the complex variable $s = \sigma + it$
$$1 - \frac{1}{6000\,(\log(1 + |n|) + \log(1 + |t|))} \le \sigma \le 3, \qquad |t| \ge E,$$
$$1 - \frac{1}{6000\,(\log(1 + |n|) + \log(1 + E))} \le \sigma \le 3, \qquad |t| \le E,$$
the function $\log\zeta(s, \chi\lambda^{n})$, for $n \neq 0$, is regular and satisfies respectively the
inequalities
$$|\log\zeta(s, \chi\lambda^{n})| \le C\,(\log^{2}(1 + |n|) + \log^{2}(1 + |t|)),$$
$$|\log\zeta(s, \chi\lambda^{n})| \le C\,(\log^{2}(1 + |n|) + \log^{2}(1 + E)).$$


As to the proof, this lemma is only a slight modification of Theorem 6 in
my paper “W.” We have only to consider $\log\zeta(s, \chi\lambda^{n})$ instead of $(\zeta'/\zeta)(s, \chi\lambda^{n})$
in that theorem. The Carathéodory-Landau lemma permits then all the necessary
conclusions in the proof. Moreover, in the wording of the present
lemma we have replaced r(t, by its definition r(¢, m) =(1+|¢|)?(1+||)*. 

By means of this lemma we are in a position to transform the path of
integration from $2 - i\infty$ to $2 + i\infty$ into the following path $P$:


1 
6000 (log (1 + | |) + log (1 — 
1 
=1-— 
6000 (log (1 + | | ) + log (1 + E)) 


1 
We thus have 
f (xx’)* log xd") ds 
p (s + iyn)(s — iyn)(s + 1+ iyn)(s +1 — iyn) 
E (6000(log (1+ (log? (1+ E) + log? | )) | 
(Jog? (1 + 4) + log? (1 + | n| )) 
=I,+ 12, 
say. Without loss of generality we can again assume ” >0. We have 
I; < (itn) 


dt 
- log? (1 + f , 


(24) 


+C 


and the same conclusions which led to (23) yield here
log? (1 + 

n 

As to $I_2$ we intersect the path of integration at the points
$$K = \max(E, \gamma n), \qquad L = \max\big(E,\ 2\gamma n\, e^{(\log xx')^{1/2}}\big)$$
and write
$$\int_E^{\infty} = \int_E^{K} + \int_K^{L} + \int_L^{\infty} = J_1 + J_2 + J_3,$$
say. We assume always $xx' > 1$; it follows that $(\log xx')^{1/2} > 0$ and hence $K \le L$.
It may be that $E = K$ or $E = K = L$, in which cases $J_1 = 0$, or $J_1 = J_2 = 0$,
respectively.
For $J_1$, only $E < K$ remains to be considered. We have
J - f yn (x2’)!-e/ (log (1+t)+log (+")) (log? (1 + t) + log? (1 + n)) 
(xx’)!-¢/log (1+n) log? (1 + n) yn dt 
(3 + yn)(3 + yn) 0 (f+ yn— yn — 2) 

log? (1 + n) 


n2 


and have 


<C 


This estimate of course is also valid for $E = K$.
Secondly, for $K = L$ we have $J_2 = 0$. If we assume $K < L$ we have


2yn exp ((log rz’ )1/2) (log (1+n)+log (log? (1 + n) + log? (1 + t)) 
hs f 
G+tt +t— + — yn) 
(log (1+n)+(log zz’ )1/2) log? (1 + t) 
Y 


<C 


(3 + 2yn)(} + 2yn) n (3 yn) 


<— (xx’)!-¢/ (log (1+n)+(log zz’ )1/2) 
2 


Finally we have 
hes xx’ log? (1 + m) + log? (1 +4, 
2 2 


yn exp ((log zz’ )1/2) (t — yn)? 


yn exp ((log z2z’)!1/2) (3yn)? 
ss’ f* log? (1 + #) sz’ f° dt 
f f — 
ye exp ((log n yn exp ((log z2’)1/2) 13/2 


xx 
<CcC— e7 (1/2) (log xz’ )1/2 
n? 


These estimates together furnish the inequality
log? (1 +| 


n 


Collecting our results we get from (22), (23), (24), (25), (26) 
Dd (« — w)(x’ — w’) 


= log? (1 
= H + O((xx’)7/4) + o( (1 + #) 


n=1 


—e log rz’ / (log (1+n)+(log 


In order to calculate the sum in the second O-term we put 


[exp ((log zx’ )1/2)]-1 
+ 
[exp((log zz’)1/2)] 
< (¢/2) (log + 
n=[exp ((log z2’)1/2)] n2 
log? 
< Ce~(e/2) (log zz’) 1/2 (1/2) (log x2’ > 


Hence our result reads as follows:
(27) $$P_1(x, x') = \sum_{0 \prec \omega \prec x} (x - \omega)(x' - \omega') = H + O\big((xx')^{2} e^{-c(\log xx')^{1/2}}\big)$$
with $c$ in a new meaning.
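Our next aim is to give a concise formula for $H$. We start with the formulas
(28) $$\sum_{\substack{(\omega)_1 \\ N(\omega) < y}} 1 = \frac{q}{2h} \int_2^{y} \frac{dt}{\log t} + O\big(y\, e^{-c(\log y)^{1/2}}\big)$$
and
(29) $$\sum_{\substack{(\omega)_1 \\ N(\omega) < y}} \log N(\omega) = \frac{q}{2h}\, y + O\big(y\, e^{-c(\log y)^{1/2}}\big),$$
where $q$ has the same meaning as in (18). These equations are well known.*
The divisor $2h/q$ indicates the number of classes of ideals, from which in the
summations only one is considered.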
If we have the equation
$$\sum_{\substack{(\omega)_1 \\ N(\omega) < y}} c_{\omega} = g(y),$$
we can conclude
(30) $$\int_2^{Y} g(y)\,dy = \int_2^{Y} \sum_{\substack{(\omega)_1 \\ N(\omega) < y}} c_{\omega}\,dy = \sum_{\substack{(\omega)_1 \\ N(\omega) < Y}} c_{\omega}\,(Y - N(\omega)).$$

* Edmund Landau, Über Ideale und Primideale in Idealklassen, Mathematische Zeitschrift,
vol. 2 (1918), pp. 52-154, Theorem LXXXIII and Theorem LXXXV.
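Identity (30) is a plain interchange of summation and integration, and can be illustrated with a small computation. The sketch below is an assumption-laden stand-in: it uses the rational primes $p$ in place of the prime numbers $\omega$ of the field (so $N(\omega)$ becomes $p$), and all function names are hypothetical.

```python
import math


def primes_below(limit):
    """Rational primes p < limit (stand-in for the norms N(omega))."""
    sieve = [True] * max(limit, 2)
    sieve[0:2] = [False, False]
    for p in range(2, int(limit ** 0.5) + 1):
        if sieve[p]:
            for m in range(p * p, limit, p):
                sieve[m] = False
    return [p for p, is_p in enumerate(sieve) if is_p]


def integral_of_partial_sums(points, coeffs, Y):
    """Exact integral over 2 <= y <= Y of the step function
    g(y) = sum of coeffs[i] over all i with points[i] < y.
    Assumes points is sorted and 2 <= points[i] < Y."""
    total, running = 0.0, 0.0
    breakpoints = list(points) + [Y]
    for i, p in enumerate(points):
        running += coeffs[i]                       # g jumps by c at y = p
        total += running * (breakpoints[i + 1] - p)
    return total


def weighted_sum(points, coeffs, Y):
    """Right-hand side of (30): sum of c * (Y - N)."""
    return sum(c * (Y - p) for p, c in zip(points, coeffs))


Y = 100
ps = primes_below(Y)
for coeffs in ([1.0] * len(ps), [math.log(p) for p in ps]):
    print(integral_of_partial_sums(ps, coeffs, Y), weighted_sum(ps, coeffs, Y))
```

The two coefficient choices correspond to the two cases $c_\omega = 1$ and $c_\omega = \log N(\omega)$ used in the text.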


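From (30), (28) and (29) we deduce for $c_{\omega} = 1$ and $c_{\omega} = \log N(\omega)$
$$\sum_{\substack{(\omega)_1 \\ N(\omega) < Y}} (Y - N(\omega)) = \frac{q}{2h} \int_2^{Y}\!\!\int_2^{y} \frac{dt}{\log t}\,dy + O\left(\int_2^{Y} y\, e^{-c(\log y)^{1/2}}\,dy\right),$$
$$\sum_{\substack{(\omega)_1 \\ N(\omega) < Y}} (Y - N(\omega)) \log N(\omega) = \frac{q}{2h} \int_2^{Y} y\,dy + O\left(\int_2^{Y} y\, e^{-c(\log y)^{1/2}}\,dy\right).$$
Now we have
$$\int_2^{Y}\!\!\int_2^{y} \frac{dt}{\log t}\,dy = Y \int_2^{Y} \frac{dt}{\log t} - \int_2^{Y} \frac{t\,dt}{\log t}$$
and
$$\int_2^{Y} y\, e^{-c(\log y)^{1/2}}\,dy = O\big(Y^{2} e^{-c(\log Y)^{1/2}}\big),$$
and therefore
$$\sum_{\substack{(\omega)_1 \\ N(\omega) < Y}} (Y - N(\omega)) = \frac{q}{2h} \left\{ Y \int_2^{Y} \frac{dt}{\log t} - \int_2^{Y} \frac{t\,dt}{\log t} \right\} + O\big(Y^{2} e^{-c(\log Y)^{1/2}}\big),$$
$$\sum_{\substack{(\omega)_1 \\ N(\omega) < Y}} (Y - N(\omega)) \log N(\omega) = \frac{q}{2h} \cdot \frac{Y^{2}}{2} + O\big(Y^{2} e^{-c(\log Y)^{1/2}}\big).$$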

These equations together with (28) and (29) yield
(31) $$\sum_{\substack{(\omega)_1 \\ N(\omega) < Y}} N(\omega) = \frac{q}{2h} \int_2^{Y} \frac{t\,dt}{\log t} + O\big(Y^{2} e^{-c(\log Y)^{1/2}}\big),$$
(32) $$\sum_{\substack{(\omega)_1 \\ N(\omega) < Y}} N(\omega) \log N(\omega) = \frac{q}{2h} \cdot \frac{Y^{2}}{2} + O\big(Y^{2} e^{-c(\log Y)^{1/2}}\big).$$

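With the abbreviation $Y = xx'$ we get from (22a)
$$H = \frac{1}{\log\eta} \sum_{\substack{(\omega)_1 \\ N(\omega) < Y}} \left\{ Y(\log Y - 2) + N(\omega)(\log Y + 2) - Y \log N(\omega) - N(\omega) \log N(\omega) \right\}$$
and hence
$$H = \frac{q}{2h\log\eta} \left\{ Y(\log Y - 2) \int_2^{Y} \frac{dt}{\log t} + (\log Y + 2) \int_2^{Y} \frac{t\,dt}{\log t} - \frac{3}{2} Y^{2} \right\} + O\big(Y^{2} e^{-c(\log Y)^{1/2}}\big)$$
and finally from (27), in view of (18),
(33) $$\begin{aligned}
P_1(x, x') &= \sum_{0 \prec \omega \prec x} (x - \omega)(x' - \omega') \\
&= \frac{1}{2h\log\varepsilon} \left\{ Y(\log Y - 2) \int_2^{Y} \frac{dt}{\log t} + (\log Y + 2) \int_2^{Y} \frac{t\,dt}{\log t} - \frac{3}{2} Y^{2} \right\} + O\big(Y^{2} e^{-c(\log Y)^{1/2}}\big), \qquad Y = xx'.
\end{aligned}$$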

3. PASSING FROM $P_1(x, x')$ TO $P_0(x, x')$
Let $\delta$ be a positive number, $\delta < x$, which may be at our disposal until
later. We have on one hand


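(34) $$P_1(x + \delta, x') - P_1(x, x') = \delta \sum_{\substack{0 < \omega < x \\ 0 < \omega' < x'}} (x' - \omega') + \sum_{\substack{x \le \omega < x + \delta \\ 0 < \omega' < x'}} (x + \delta - \omega)(x' - \omega')$$
and on the other hand from (33)
(35) $$\begin{aligned}
2h \log\varepsilon\, &\big(P_1(x + \delta, x') - P_1(x, x')\big) \\
= {} &\big((x + \delta)x'(\log (x + \delta)x' - 2) - xx'(\log xx' - 2)\big) \int_2^{xx'} \frac{dt}{\log t} \\
&+ (x + \delta)x'(\log (x + \delta)x' - 2) \int_{xx'}^{(x+\delta)x'} \frac{dt}{\log t} \\
&+ \big(\log (x + \delta)x' - \log xx'\big) \int_2^{xx'} \frac{t\,dt}{\log t} \\
&+ \big(\log (x + \delta)x' + 2\big) \int_{xx'}^{(x+\delta)x'} \frac{t\,dt}{\log t} - \frac{3}{2}\,\delta x'(2xx' + \delta x') \\
&+ O\big((xx')^{2} e^{-c(\log xx')^{1/2}}\big).
\end{aligned}$$
We give now an estimate of the second sum on the right-hand side in (34),
$$\Bigg| \sum_{\substack{x \le \omega < x + \delta \\ 0 < \omega' < x'}} (x + \delta - \omega)(x' - \omega') \Bigg| \le \delta x' \sum_{\substack{x \le \mu < x + \delta \\ 0 < \mu' < x'}} 1,$$
where $\mu$ runs through all integers in the assigned rectangle. The integers forming
a point lattice, their number is of the order of the rectangle’s area, plus
an error arising from the boundaries. But as the boundaries of the rectangle
of one small side $\delta$ and one long side $x'$ are relatively long, we have to transform
the rectangle into a more suitable shape. In fact, there are as many integers
in
$$x \le \mu < x + \delta, \qquad 0 < \mu' < x',$$
as in
$$x\eta^{l} \le \mu < (x + \delta)\eta^{l}, \qquad 0 < \mu' < x'\eta^{-l},$$
and we can choose the exponent $l$ in such a way as to have $\delta\eta^{l}$ and $x'\eta^{-l}$
of the same order, i.e.,
$$\delta\eta^{l} = O((\delta x')^{1/2}), \qquad x'\eta^{-l} = O((\delta x')^{1/2}).$$
Then the length of the boundary of the rectangle is $O((\delta x')^{1/2})$, and we have
$$\sum_{\substack{x \le \mu < x + \delta \\ 0 < \mu' < x'}} 1 = O\big(\delta x' + (\delta x')^{1/2} + 1\big)$$
and hence
(36) $$\sum_{\substack{x \le \omega < x + \delta \\ 0 < \omega' < x'}} (x + \delta - \omega)(x' - \omega') = O(\delta^{2} x'^{2}) + O(\delta^{3/2} x'^{3/2}) + O(\delta x').$$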


zSw< 
0< w< 


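Moreover we have
(37) $$\log (x + \delta)x' - \log xx' = \log\left(1 + \frac{\delta}{x}\right) = \frac{\delta}{x} + O\left(\frac{\delta^{2}}{x^{2}}\right),$$
(38) $$\begin{aligned}
(x + \delta)x' \log (x + \delta)x' - xx' \log xx' &= xx'\big(\log (x + \delta)x' - \log xx'\big) + \delta x' \log (x + \delta)x' \\
&= \delta x' + O\left(\frac{\delta^{2} x'}{x}\right) + \delta x' \log xx' + O\left(\frac{\delta^{2} x'}{x}\right) \\
&= \delta x'(1 + \log xx') + O\left(\frac{\delta^{2} x'}{x}\right),
\end{aligned}$$
and
(39) $$\int_{xx'}^{(x+\delta)x'} \frac{dt}{\log t} = \frac{\delta x'}{\log xx'} + O\left(\delta x' \left(\frac{1}{\log xx'} - \frac{1}{\log (x + \delta)x'}\right)\right) = \frac{\delta x'}{\log xx'} + O\left(\frac{\delta^{2} x'}{x (\log xx')^{2}}\right)$$
and similarly
(40) $$\int_{xx'}^{(x+\delta)x'} \frac{t\,dt}{\log t} = \delta x' \frac{xx'}{\log xx'} + O\left(\delta x' \left(\frac{(x + \delta)x'}{\log (x + \delta)x'} - \frac{xx'}{\log xx'}\right)\right) = \frac{\delta x x'^{2}}{\log xx'} + O\left(\frac{\delta^{2} x'^{2}}{\log xx'}\right).$$
The equations (34) to (40) give, after division by $\delta$ and after due simplifications,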

2h log € Q(x, x’) = 2hloge Do (x’ — w’) 
0<w<z 
0<w’<z 


ze’ dt 
= x'(log xx’ — nf —+—f 


2 logt log ¢ 


— + O(6x"2) + + O(x') 


(log zz’ )i/2 
+ o( ). 


The same process that we applied to $x$ is now to be used with respect to $x'$.
Let $\delta'$ be positive, $\delta' < x'$. We have on one hand


O0<w<z 0<w<z 
0<w’< 2’+8’ 0<w’< 2’ 


Vit DY -w’. 
0<w<z 


D> (x +8’ a’) | = O(8'2x + §/3/2x1/2 + 6’) 
O<u<z 


in analogy to our former argument. From these equations and from (41) we
deduce on the other hand
$2h \log\varepsilon \cdot \delta'$
0<w<z 
0< w’< 


= ((x’ + 6’)(log x(x’ + 6’) — 1) — x’(log xx’ — 1)) 
2 


z(2’+6’) z(z’+8’) di 
= t log ¢ t 
— x6'(2x’ + 6’) + + O(6'8/2x1/2) + O(6’) + 
+ 4+ O(x") + | 


If we make use of (38), (39), (40) after replacing $x$, $\delta$ by $x'$, $\delta'$ and vice versa,
we get from (42) after division by $\delta'$ and some easy calculations


(42) 


zz’ dt 

2h log > 1 = log f —— — xx’ + O(6’x) + O(6!/2x1/2) + O(1) 
O<w<z 2 log t 
0<w’< 2’ 


6 §1/24/3/2 x’ (xx’)? 
6’ a’ 56’ 


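Now we put
$$\delta = x\, e^{-(c/2)(\log xx')^{1/2}}, \qquad \delta' = x' e^{-(c/4)(\log xx')^{1/2}}$$
and have then
$$\sum_{\substack{0 < \omega < x \\ 0 < \omega' < x'}} 1 = \frac{1}{2h\log\varepsilon} \left( \log xx' \int_2^{xx'} \frac{dt}{\log t} - xx' \right) + O\big(xx'\, e^{-(c/4)(\log xx')^{1/2}}\big).$$
As we find by partial integration
$$\log Y \int_2^{Y} \frac{dt}{\log t} - Y = \log Y \int_2^{Y} \frac{dt}{(\log t)^{2}} + O(\log Y),$$
we have finally, with $c$ in a new meaning,
(43) $$P_0(x, x') = \sum_{\substack{0 < \omega < x \\ 0 < \omega' < x'}} 1 = \frac{1}{2h\log\varepsilon} \log xx' \int_2^{xx'} \frac{dt}{(\log t)^{2}} + O\big(xx'\, e^{-c(\log xx')^{1/2}}\big),$$
which is the special case $\mathfrak{a} = (1)$ of the theorem on page 76 of my paper “K.”

ON DISTRIBUTIONS ADMITTING A SUFFICIENT
STATISTIC*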


BY 
B. O. KOOPMAN 


Let $x$ be a variate taking on values, determined by chance, on the axis
of reals, $R$; and let the frequency function $f(\theta_1, \cdots, \theta_\nu, x)$ be known, but
involve $\nu$ parameters $(\theta_1, \cdots, \theta_\nu)$ which are unknown, but are confined to a
known region $\Omega$ of real $\nu$-space. That is, we are assuming that if $\nu$ values of
$(\theta_1, \cdots, \theta_\nu)$ are given,
(1) $$f(\theta_1, \cdots, \theta_\nu, x) = \lim_{\Delta x \to 0+} \frac{\text{prob.}\,[x \le \mathbf{x} < x + \Delta x]}{\Delta x}, \qquad \int_R f(\theta_1, \cdots, \theta_\nu, x)\,dx = 1.$$


A problem of practical importance is the statistical estimation of the
parameters $(\theta_1, \cdots, \theta_\nu)$: Let the result of $n$ independent observations of $x$
made on the assumption that $(\theta_1, \cdots, \theta_\nu)$ is fixed yield the numbers
$x_1, \cdots, x_n$, the “sample”; assuming nothing known a priori concerning the
position of $(\theta_1, \cdots, \theta_\nu)$ in $\Omega$, so that Bayes’ formula is inapplicable, how can
the sample $x_1, \cdots, x_n$ be used to secure information regarding $(\theta_1, \cdots, \theta_\nu)$?

According to R. A. Fisher† this problem is to be solved by finding $\nu$ functions
of $n$ arguments, $\varphi_j(x_1, \cdots, x_n)$ $(j = 1, \cdots, \nu)$, such that when the arguments
are replaced by the values in the sample, the resulting values are the
appropriate values of $\theta_j$. The question of how appropriateness is to be determined
has its roots deep in the foundations of the subject, and will not be
considered here. The function $\varphi_j$ is called by R. A. Fisher an estimate of $\theta_j$,
or a statistic.

Assuming that the statistics $\varphi_j$ $(j = 1, \cdots, \nu)$ exist, a question arises immediately
which, stated intuitively, runs as follows: does the position of the
single point $(\varphi_1(x_1, \cdots, x_n), \cdots, \varphi_\nu(x_1, \cdots, x_n))$ in $\Omega$ “contain all the information”
relative to the position of $(\theta_1, \cdots, \theta_\nu)$ “contained in the sample”
of $n$ numbers $(x_1, \cdots, x_n)$ when (as is usual) $n > \nu$? Or is “relevant
information” lost when $x_1, \cdots, x_n$ are discarded and only the numbers
$\varphi_1(x_1, \cdots, x_n), \cdots, \varphi_\nu(x_1, \cdots, x_n)$ retained? In the former case, R. A.
Fisher describes the set of functions $\varphi_j$ as a set of sufficient statistics. The
term “relevant information” is used by Fisher in two senses which, to our
way of thinking, have never been shown to be fully equivalent: in the intuitive
sense (suggested by the word but never defined) and in the sense of a certain
definite integral.*

* Presented to the Society, April 20, 1935; received by the editors March 16, 1935.
† On the mathematical foundations of theoretical statistics, Philosophical Transactions, Royal Society
of London, (A), vol. 222, pp. 309-368.

The first object of the present paper is to give a simple definition of the
sufficiency of a statistic which, we feel, expresses the intuitive notion of R. A.
Fisher and is mathematically equivalent to certain of his formulations of
this concept, at least under suitable mathematical restrictions.† The second
object is to prove that the only distributions (of the analytic nature met with
in practice, at least) which possess a sufficient statistic are of the very special
exponential type of formula (4) below.


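DEFINITION. The distribution $f(\theta_1, \cdots, \theta_\nu, x)$ shall be said to admit the
system of statistics $\varphi_j(x_1, \cdots, x_n)$ $(j = 1, \cdots, \nu)$ as sufficient statistics if the
equations
(2) $$\varphi_j(x_1, \cdots, x_n) = \varphi_j(x_1', \cdots, x_n') \qquad (j = 1, \cdots, \nu)$$
imply the following identity in $(\theta_1, \cdots, \theta_\nu)$, $(\theta_1', \cdots, \theta_\nu')$ on $\Omega$:
(3) $$\frac{\prod_{i=1}^{n} f(\theta_1, \cdots, \theta_\nu, x_i)}{\prod_{i=1}^{n} f(\theta_1, \cdots, \theta_\nu, x_i')} = \frac{\prod_{i=1}^{n} f(\theta_1', \cdots, \theta_\nu', x_i)}{\prod_{i=1}^{n} f(\theta_1', \cdots, \theta_\nu', x_i')},$$
this equation to be interpreted after formal multiplication wherever denominators
are zero.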

Here, of course, the letters $\theta$, $\theta'$, $x$, $x'$, etc., denote variables in the sense
of classical analysis.
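To make the definition concrete, here is a small numerical illustration. It is an assumed example, not taken from the paper: the normal density with unknown mean $\theta$ and unit variance ($\nu = 1$), with the sample sum as the statistic. The helper names `normal_pdf` and `likelihood_ratio` are hypothetical. According to (3), the ratio of the two sample likelihoods should not depend on $\theta$ whenever the two samples give the same value of the statistic.

```python
import math


def normal_pdf(theta, x):
    """Assumed example density: normal with mean theta and unit variance."""
    return math.exp(-0.5 * (x - theta) ** 2) / math.sqrt(2.0 * math.pi)


def likelihood_ratio(theta, sample, sample_prime):
    """One member of (3): prod f(theta, x_i) / prod f(theta, x_i')."""
    numerator = math.prod(normal_pdf(theta, x) for x in sample)
    denominator = math.prod(normal_pdf(theta, x) for x in sample_prime)
    return numerator / denominator


sample = [0.0, 1.0, 2.0]
same_sum = [1.0, 1.0, 1.0]       # same value of the statistic (sum = 3)
other_sum = [0.0, 0.0, 2.0]      # different value of the statistic (sum = 2)

for theta in (-1.0, 0.3, 2.0):
    # First ratio is the same for every theta; the second one changes with theta.
    print(theta,
          likelihood_ratio(theta, sample, same_sum),
          likelihood_ratio(theta, sample, other_sum))
```

For this density the logarithm of the ratio equals $-\tfrac12(\sum x_i^2 - \sum x_i'^2) + \theta(\sum x_i - \sum x_i')$, so the dependence on $\theta$ disappears exactly when the sums agree.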

This definition, like so many definitions of applied mathematics, seeks
to throw into precise and explicit form an intuitive conception; and in the
nature of things, its a priori justification is to be sought in an examination
of its adequacy as a rendering of the intuition in question: in a sort of introspection.
In the present case the intuitive starting point may be illustrated
in the following example: Suppose that a “random trial” can lead, among
others, to the mutually exclusive results $A$ or $A'$ of respective a priori probabilities
$p(\theta)$ and $p'(\theta)$ $(p(\theta) + p'(\theta) \le 1)$, dependent upon the unknown parameter
$\theta$, itself having no a priori distribution, as above. Suppose first that
$p(\theta)/p'(\theta)$ is independent of $\theta$: then we should feel that the datum: “$A$ happened”
tells us nothing about the value of $\theta$ which is not contained in the datum:
“$A$ or $A'$ happened.” And, secondly, if, e.g., $p(\theta_1)/p'(\theta_1) < p(\theta_2)/p'(\theta_2)$,
the datum “$A$ happened” will, as compared with the less complete datum
“$A$ or $A'$ happened,” make us feel that $\theta_2$ is a better guess for the value of $\theta$
than $\theta_1$. This is extended in obvious fashion to the case of $\nu$ parameters
$(\theta) = (\theta_1, \cdots, \theta_\nu)$.

* R. A. Fisher, Theory of statistical estimation, Proceedings of the Cambridge Philosophical
Society, vol. 22, pp. 712-714.

† For a comparative study of these various definitions, cf. a paper by J. L. Doob, entitled
Statistical estimation, in the present number of these Transactions, pp. 410-421.

The present application is obvious: Select arbitrarily the points
$(x_1, \cdots, x_n)$ and $(x_1', \cdots, x_n')$ in cartesian $n$-space and describe about them
mutually exclusive regions $\Delta\tau$ and $\Delta\tau'$, respectively. To quantities of higher
order in general, when the regions are small,
$$p(\theta) = \Delta\tau \prod_{i=1}^{n} f(\theta_1, \cdots, \theta_\nu, x_i), \qquad p'(\theta) = \Delta\tau' \prod_{i=1}^{n} f(\theta_1, \cdots, \theta_\nu, x_i')$$
are the a priori probabilities that $n$ contemplated trials give a sample in $\Delta\tau$
or in $\Delta\tau'$, respectively. Now $p(\theta)/p'(\theta)$ is, apart from the constant factor
$\Delta\tau/\Delta\tau'$, equal to the left-hand member of equation (3). Thus, (3) expresses
the fact that no information is lost, in the above intuitive sense, when the
datum “the sample was in $\Delta\tau$ (approximately, $= (x_1, \cdots, x_n)$)” is replaced
by the datum “the sample was in either $\Delta\tau$ or $\Delta\tau'$”. For a sufficient statistic,
this must be the case wherever $(x_1, \cdots, x_n)$ and $(x_1', \cdots, x_n')$ are connected
by the equation (2).

In this interpretation we have left out of account values for which
$f(\theta_1, \cdots, \theta_\nu, x) = 0$, or for which the above expressions for the probability
of a point in $\Delta\tau$, $\Delta\tau'$ are not applicable: these are exceptional in the sense that
there is, at least for the corresponding values of $(\theta_1, \cdots, \theta_\nu)$, a set containing
them all, and such that the probability of having a point in this set is zero.

It is necessary to emphasize a point here: For example when $\nu = 2$, in
saying that $(\theta_1, \theta_2)$ are unknown parameters, it is simply meant that the
point $(\theta_1, \theta_2)$ has an unknown position in $\Omega$, with no a priori distribution.
But it is perfectly conceivable that when a value of one of the parameters, $\theta_1$,
is given, the other, $\theta_2$, may be by no means unknown in this complete sense:
it may either be determined, or have a known a priori distribution, etc., i.e.,
$\theta_1$ and $\theta_2$ may be statistically dependent. It is shown at once with the aid of
equations (2) and (3) used in various combinations, that when $\theta_1$ and $\theta_2$ are
statistically independent, a sufficient condition for the sufficiency of the statistics
$\varphi_1$, $\varphi_2$ is that (i) when $\theta_1$ is given, $\varphi_2$ be a sufficient statistic for the unknown
$\theta_2$, and (ii) when $\theta_2$ is given, $\varphi_1$ be a sufficient statistic for $\theta_1$. But it is
perfectly conceivable that $f(\theta_1, \theta_2, x)$ admit the pair of sufficient statistics
$\varphi_1$, $\varphi_2$ without admitting a sufficient statistic at all for $\theta_2$ when the value of $\theta_1$
is given, etc. Finally, $f(\theta_1, \theta_2, x)$ may admit only a single sufficient statistic,
for $\theta_1$, say, dependent or not on $\theta_2$: this means that the equation
$\varphi_1(\theta_2; x_1, \cdots, x_n) = \varphi_1(\theta_2; x_1', \cdots, x_n')$ implies (3) with $\theta_2' = \theta_2$ $(\nu = 2)$.

In the following we shall use the notation $A \times B$ for the product set of
two sets $A$ and $B$, consisting of all pairs of elements $(a, b)$ where $a$ is an element
of $A$, $b$ one of $B$. And we shall write $A^{2} = A \times A$, the set of pairs $(a, a')$,
$a$ and $a'$ both in $A$.

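THEOREM I. Let $f(\theta_1, \cdots, \theta_\nu, x)$ be analytic and not zero at each point of a
subset $\Omega \times R - T$ of $\Omega \times R$, and let $\varphi_j(x_1, \cdots, x_n)$ $(j = 1, \cdots, \nu)$ be continuous
throughout $R^{n}$; finally, suppose that $n > \nu$.

Then a necessary condition that $(\varphi_1, \cdots, \varphi_\nu)$ form a set of sufficient statistics
for this distribution is that at each given point $(a_1, \cdots, a_\nu, b)$ of $\Omega \times R - T$
a neighborhood $N_{ab} = \omega_{ab} \times r_{ab} \subset \Omega \times R - T$ exist, where
$$\omega_{ab}: \ |\theta_j - a_j| < h \quad (j = 1, \cdots, \nu),$$
$$r_{ab}: \ |x - b| < h,$$
such that
(4) $$f(\theta_1, \cdots, \theta_\nu, x) = \exp\left[\sum_{k=1}^{\mu} \Theta_k X_k + \Theta + X\right],$$
where $\Theta_k$, $\Theta$ are real, single-valued, analytic functions of $(\theta_1, \cdots, \theta_\nu)$ in $\omega_{ab}$,
and $X_k$, $X$ are real, single-valued, analytic functions of $x$ in $r_{ab}$, and where,
finally, $\mu \le \nu$ ($\mu = 0$ means that all the functions $\Theta_k$, $X_k$ are lacking).

Furthermore, if $\mu$ has the smallest value for which the identity (4) is valid,
a circumstance which can always be brought about, it must follow that
$$\sum_{i=1}^{n} X_k(x_i) = \Psi_k(\varphi_1, \cdots, \varphi_\nu) \qquad (k = 1, \cdots, \mu),$$
where $\Psi_k$ is a single-valued function of its $\nu$ arguments.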


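We shall give the proof in the case $\nu = 2$, which is sufficiently illustrative.
We shall assume, further, that $n = 3$: For if $n > 3$ we have but to write
$\bar\varphi_j(x_1, x_2, x_3) = \varphi_j(x_1, x_2, x_3, b, \cdots, b)$ and, in (3), to take
$x_4 = \cdots = x_n = x_4' = \cdots = x_n' = b$, to have (3) a consequence of (2) with $\nu = 2$, $n = 3$.

Since $N_{ab}$ is in $\Omega \times R - T$, $f(\theta_1, \theta_2, x)$ is real, single-valued, analytic, and
$\neq 0$, throughout $N_{ab}$; and hence (with the real determination of the log) the
function
$$\log \frac{f(\theta_1, \theta_2, x)}{f(\theta_1', \theta_2', x)}$$
is real, single-valued, and analytic for all $(\theta_1, \theta_2, \theta_1', \theta_2', x)$ in the neighborhood
$\omega_{ab}^{2} \times r_{ab}$ of $(a_1, a_2, a_1, a_2, b)$. It follows that on setting successively
$(\theta_1', \theta_2') = (a_{11}, a_{12}), (a_{21}, a_{22}), (a_{31}, a_{32})$ (all in $\omega_{ab}$) in equation (3) and then taking
logarithms, it may be made to yield the system
(5) $$\sum_{i=1}^{3} \log \frac{f(\theta_1, \theta_2, x_i)}{f(a_{j1}, a_{j2}, x_i)} = \sum_{i=1}^{3} \log \frac{f(\theta_1, \theta_2, x_i')}{f(a_{j1}, a_{j2}, x_i')} \qquad (j = 1, 2, 3),$$

and each member will be a real, single-valued, and analytic function for all
points $(\theta_1, \theta_2, x_1, x_2, x_3)$ etc. on $\omega_{ab} \times r_{ab}^{3}$.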
Since by hypothesis $\varphi_1, \varphi_2$ form a system of sufficient statistics, equations
(5) are a consequence of

$$\varphi_1(x_1, x_2, x_3) = \varphi_1(x_1', x_2', x_3'), \qquad \varphi_2(x_1, x_2, x_3) = \varphi_2(x_1', x_2', x_3'). \tag{6}$$

Geometrically, this means that, given any $(\theta_1, \theta_2)$ of $\omega_{ab}$, the locus (6) on $r_{ab}^3$
through each given point $(x_1', x_2', x_3')$ of this region must be a subset of the
locus (5) through this same point. Now this circumstance implies the identical
vanishing of the jacobian of the left-hand members of (5) with respect to $x_1$,
$x_2$, $x_3$ throughout the neighborhood considered. For suppose that a point
$(\theta_1^0, \theta_2^0, x_1^0, x_2^0, x_3^0)$ of this neighborhood existed at which the jacobian failed
to vanish; a closed cubical neighborhood $S_0$ of $(x_1^0, x_2^0, x_3^0)$ would exist, lying
wholly within $r_{ab}^3$, at no point of which the jacobian vanishes. It would then
follow by the Implicit Function Theorem that equations (5), when $\theta_i = \theta_i^0$ and
$(x_1, x_2, x_3)$ and $(x_1', x_2', x_3')$ are on $S_0$, could only be satisfied when
$(x_1, x_2, x_3) = (x_1', x_2', x_3')$. Hence (6) would imply $(x_1, x_2, x_3) = (x_1', x_2', x_3')$,
provided we remain confined to $S_0$. Hence the equations

$$u_1 = \varphi_1(x_1, x_2, x_3), \qquad u_2 = \varphi_2(x_1, x_2, x_3)$$

define a one-one correspondence between $S_0$ and $U_0$, where $U_0$ is the range
of the point $(u_1, u_2)$ defined by these equations as $(x_1, x_2, x_3)$ traces out $S_0$:
by the continuity of $\varphi_j$ and the nature of $S_0$, $U_0$ will be bounded and closed
in the $u_1u_2$-plane. But it follows at once under these circumstances that the
above correspondence is continuous both ways, and thus contradicts the
preservation of dimensionality under homeomorphism.

Consider the matrix of the derivatives of the left-hand members of (5)
with respect to $x_1, x_2, x_3$. It is found at once to have the form
92, x5) 
Let us write, further, for the upper left-hand minors, 
M; = || M; = || || 


and let $\rho$ be the identical rank of $M_3$ (the order of the non-identically vanishing
determinant of highest order in $M_3$). We have seen that under the hypothesis
of our theorem, we are confined to the possibilities $\rho = 0, 1, 2$. It is
obviously permissible to assume that it is $\det M_\rho$ which does not vanish
identically.

Case 1; $\rho = 0$. The equation

$$\det M_1 \equiv \frac{\partial}{\partial x_1} \log \frac{f(\theta_1, \theta_2, x_1)}{f(a_{11}, a_{12}, x_1)} = 0 \tag{7}$$

can be integrated with respect to $x_1$, from $b$ to $x$ in $r_{ab}$, so that

$$\log \frac{f(\theta_1, \theta_2, x)}{f(a_{11}, a_{12}, x)} = \log \frac{f(\theta_1, \theta_2, b)}{f(a_{11}, a_{12}, b)},$$

and on solving this for $f(\theta_1, \theta_2, x)$, the latter is seen to be of the form (4) with
$\mu = 0$ and

$$\Theta(\theta_1, \theta_2) = \log \frac{f(\theta_1, \theta_2, b)}{f(a_{11}, a_{12}, b)}, \qquad X(x) = \log f(a_{11}, a_{12}, x),$$

and these functions are obviously of the required analytic nature.

Case 2; $\rho = 1$. Here $\det M_2 \equiv 0$, but $\det M_1 \not\equiv 0$. This continues to be true
if the first column of $M_2$ is subtracted from the second. After this has been
done, the symbol $\partial/\partial x_1$ may be taken outside the determinant, and the
resulting equation integrated with respect to $x_1$ from $b$ to $x$ in $r_{ab}$, obtaining


62, x) f(au, X) 

log log 

f(au, @12, x) x) 

62, f(an, a12, X2) 

S(@11, X2) f(@21, G22, X2) 
62, b) f(au, a2, 5) 
log ——————_ 
S(411, @12, 5) S(@21, dee, 5) 


a (01, 02, x2) flair, x2) | 


62) = log 


& 
OX f(@11, G12, X2) S(@21, X2) 


OXe 




If, now, we can find a value for $x_2$ in $r_{ab}$ such that

$$\frac{\partial}{\partial x_2} \log \frac{f(a_{11}, a_{12}, x_2)}{f(a_{21}, a_{22}, x_2)} \neq 0, \tag{8}$$

we can solve the above equation for $f(\theta_1, \theta_2, x)$ explicitly, and establish the
form (4) for it, with $\mu = 1$, and explicit expressions of the required analytic
character for $\Theta_1(\theta_1, \theta_2)$, $\Theta(\theta_1, \theta_2)$, $X_1(x)$, $X(x)$. But to assert that no
$a_{21}, a_{22}, x_2$ exist for which (8) is true is to require that (7) be an identity, as is
seen by multiplying (8) through by $-1$, and then replacing $a_{21}, a_{22}, x_2$ by
$\theta_1, \theta_2, x_1$, respectively. But that is Case 1, which we are at present excluding.

Case 3; $\rho = 2$. The proof is similar to the above. The first column of $M_3$
is subtracted from the second and the third, and the symbol $\partial/\partial x_1$ taken out;
then the determinant equation is integrated with respect to $x_1$ from $b$ to $x$
in $r_{ab}$, and the resulting equation solved for $f(\theta_1, \theta_2, x)$. The explicit form of the
functions in (4) shows that they have the required analytic character. The
only case where the equation might fail to be solvable for $f(\theta_1, \theta_2, x)$ is when a
certain 2-rowed determinant vanishes identically; but an obvious transformation
would show that this would imply that $\det M_2 \equiv 0$, which would be Case 1
or 2, both at present excluded.

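In order to prove the last paragraph of our theorem, we substitute
expressions (4) ($\mu$ assumed to be minimal) into (3), take logarithms, etc., thus
obtaining as the necessary consequence of (2)

$$\sum_{k=1}^{\mu} \bigl[\Theta_k(\theta_1, \dots, \theta_\nu) - \Theta_k(\theta_1', \dots, \theta_\nu')\bigr] \sum_{i=1}^{n} X_k(x_i) = \sum_{k=1}^{\mu} \bigl[\Theta_k(\theta_1, \dots, \theta_\nu) - \Theta_k(\theta_1', \dots, \theta_\nu')\bigr] \sum_{i=1}^{n} X_k(x_i'),$$

which, being an identity in the $\theta$'s, $\theta'$'s, yields, on setting successively
$(\theta_1', \dots, \theta_\nu') = (a_{j1}, \dots, a_{j\nu})$ $(j = 1, \dots, \mu)$, all on $\omega_{ab}$, $\mu$ equations, which
establish the identity

$$\sum_{i=1}^{n} X_k(x_i) = \sum_{i=1}^{n} X_k(x_i')$$

as a necessary consequence of (2) (in other words, the remaining conclusion
of our theorem), provided the determinant

$$A = |A_{kj}| = \bigl|\Theta_k(\theta_1, \dots, \theta_\nu) - \Theta_k(a_{j1}, \dots, a_{j\nu})\bigr| \qquad (k, j = 1, \dots, \mu)$$

is not identically zero.

Suppose that $A \equiv 0$, and let $A'$ be the determinant obtained from $A$ by
subtracting the first column from each of the others: $A' \equiv 0$. Now the
complement of the last element in the first column of $A'$ is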


(R= = 2,---,p); 
if this is identically zero in the a’s, we should have 


(9) | = 0. 


Leaving this case to one side, suppose the $a$'s such that this complement is
not equal to zero. Then the first column of $A'$ is linearly dependent upon the
other columns, and since these latter are independent of the $\theta$'s, we have
$\mu - 1$ functions $T_s$, whose explicit form shows them to be real, single-valued,
and analytic on $\omega_{ab}$, and $\mu(\mu - 1)$ real constants $C_{ks}$, such that


pol 


s=1 
On substituting these expressions into (4), it becomes evident that it can be
written with a number less than $\mu$ of non-identically vanishing functions
$\Theta_s(\theta_1, \dots, \theta_\nu) X_s(x)$, contrary to our assumption.

Finally, in the case that (9) is true, we replace $A$ by the determinant in (9),
and after a discussion precisely like the one above, obtain the same result,
viz., that the value of $\mu$ can be reduced.

This completes the proof of Theorem I. 


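COROLLARY. If in the hypothesis of Theorem I the requirement that
$\varphi_1, \dots, \varphi_\nu$ be sufficient statistics is replaced by the requirement that, when values
of $\theta_{\lambda+1}, \dots, \theta_\nu$ are given, $\theta_1, \dots, \theta_\lambda$ are wholly unknown and have the sufficient
statistics $\varphi_j(\theta_{\lambda+1}, \dots, \theta_\nu; x_1, \dots, x_n)$ $(j = 1, \dots, \lambda)$, where
$\lambda < n$, then the conclusion remains in force, except that $\mu \le \lambda$, and that the
functions $X_k$, $X$, $\Psi_k$ involve $\theta_{\lambda+1}, \dots, \theta_\nu$.

The proof is obtained just as the proof of Theorem I, except that $\lambda$ is
used in place of $\nu$, and $\theta_1, \dots, \theta_\lambda$, $\theta_1', \dots, \theta_\lambda'$ treated as were $\theta_1, \dots, \theta_\nu$,
$\theta_1', \dots, \theta_\nu'$, the residual variables $\theta_{\lambda+1}, \dots, \theta_\nu$ being carried, unaltered, in
all the expressions.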


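THEOREM II. A sufficient condition for the sufficiency of the system of
statistics $\varphi_j(x_1, \dots, x_n)$ $(j = 1, \dots, \nu)$ of the distribution $f(\theta_1, \dots, \theta_\nu, x)$ is that,
if $R = R^* + R^{**}$ $(R^* R^{**} = 0)$,

$$f(\theta_1, \dots, \theta_\nu, x) = 0 \quad \text{for all } (\theta_1, \dots, \theta_\nu, x) \text{ on } \Omega \times R^{**};$$
$$f(\theta_1, \dots, \theta_\nu, x) = \exp\Bigl[\sum_{k=1}^{\mu} \Theta_k X_k + \Theta + X\Bigr] \quad \text{for all } (\theta_1, \dots, \theta_\nu, x) \text{ on } \Omega \times R^{*}; \tag{10}$$

$$\sum_{i=1}^{n} X_k(x_i) = \Psi_k(\varphi_1, \dots, \varphi_\nu) \quad (k = 1, \dots, \mu) \quad \text{for all } (x_1, \dots, x_n) \text{ on } (R^*)^n; \tag{11}$$

where $\Theta_k$, $\Theta$ are real single-valued functions of $(\theta_1, \dots, \theta_\nu)$ at all points of $\Omega$,
and $X_k$, $X$ real single-valued functions of $x$ defined almost everywhere on $R^*$,
such that, for each $(\theta_1, \dots, \theta_\nu)$ on $\Omega$, equation (1) is valid in the sense of
Lebesgue, and, finally, $\Psi_k$ is real and single-valued for all values of its $\nu$ arguments
on the range of $(\varphi_1, \dots, \varphi_\nu)$ as $(x_1, \dots, x_n)$ describes $(R^*)^n$.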


It is shown immediately, upon substitution and reduction, that equation
(2) implies (3), and this formal work is valid for each $(\theta_1, \dots, \theta_\nu)$,
$(\theta_1', \dots, \theta_\nu')$ on $\Omega$, and almost every $(x_1, \dots, x_n)$, $(x_1', \dots, x_n')$ on $(R^*)^n$.
This furnishes our proof.

It may be remarked that a sufficient condition constituting a partial
converse of the corollary to Theorem I in the same sense that Theorem II is a
partial converse to Theorem I is readily formulated. Furthermore, theorems
of the above nature are easily obtained in the case where $x$ denotes a point
of $N$-space $R^N$: the multivariate distributions.

A final remark is that all our results have a form invariant under change of 
parameter, and also under change of variable x, as indeed they should have. 


THEOREM III. Let equations (10) of the hypothesis of Theorem II hold with
the same conditions imposed on $\Theta_k$, $\Theta$, $X_k$, $X$ as in that theorem; suppose
further that, for each $(x_1, \dots, x_n)$ of $(R^*)^n$, the expression $\prod_{i=1}^{n} f(\theta_1, \dots, \theta_\nu, x_i)$,
regarded as a function of $(\theta_1, \dots, \theta_\nu)$, have a unique maximum $(\bar\theta_1, \dots, \bar\theta_\nu)$
interior to $\Omega$, the above function being differentiable with respect to $\theta_1, \dots, \theta_\nu$ at
$(\bar\theta_1, \dots, \bar\theta_\nu)$ and taking on a positive value at that point; and let the determinant
$|\partial\Theta_k/\partial\theta_j| \neq 0$ at $(\bar\theta_1, \dots, \bar\theta_\nu)$; and suppose, lastly, that the functions
$\bar\theta_j = \bar\theta_j(x_1, \dots, x_n)$ are admitted to constitute a system of statistics
for the estimation of $\theta_j$ $(j = 1, \dots, \nu)$; then this system of statistics is a
sufficient one.


In view of Theorem II, all that it is necessary for us to show is that, under
the hypothesis of the present theorem, equations (11) are valid with
$\varphi_j(x_1, \dots, x_n)$ defined as $\bar\theta_j(x_1, \dots, x_n)$. Now on account of the
differentiability at $(\bar\theta_1, \dots, \bar\theta_\nu)$, etc., the conditions for a maximum

$$\frac{\partial}{\partial\theta_j} \log \prod_{i=1}^{n} f(\theta_1, \dots, \theta_\nu, x_i) = 0 \qquad (j = 1, \dots, \nu) \tag{12}$$

must be satisfied when $(\theta_1, \dots, \theta_\nu) = (\bar\theta_1, \dots, \bar\theta_\nu)$; these become, by use of
(10),

$$\sum_{k} \frac{\partial\Theta_k}{\partial\theta_j} \Bigl[\sum_{i=1}^{n} X_k(x_i)\Bigr] + n \frac{\partial\Theta}{\partial\theta_j} = 0,$$

and our task will be accomplished when it is shown that, for a given set of
values $(\bar\theta_1, \dots, \bar\theta_\nu)$, these equations determine the $\nu$ quantities $\bigl[\sum_{i=1}^{n} X_k(x_i)\bigr]$
uniquely. (The existence of at least one set of values for the $[\ ]$ expressions is
a direct consequence of the definition of $\bar\theta_j$.) Now the uniqueness is an
immediate consequence of the assumption

$$\left|\frac{\partial\Theta_k}{\partial\theta_j}\right|_{(\bar\theta)} \neq 0$$

of our hypothesis.

The statistic $(\bar\theta_1, \dots, \bar\theta_\nu)$ in this theorem is called by R. A. Fisher the
"maximum likelihood" statistic, and Theorem III is our rendering of Fisher's
theorem that if a sufficient statistic exist, the maximum likelihood gives such
a statistic.

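As examples, consider the normal distribution

$$f(\theta_1, \theta_2, x) = \frac{1}{(2\pi\theta_2^2)^{1/2}} \exp\Bigl[-\frac{(x - \theta_1)^2}{2\theta_2^2}\Bigr],$$

the Cauchy distribution

$$f(\theta_1, \theta_2, x) = \frac{1}{\pi} \frac{\theta_2}{\theta_2^2 + (x - \theta_1)^2}$$

(each with $\Omega$: $-\infty < \theta_1 < +\infty$, $\theta_2 > 0$), and the distribution corresponding
with Pearson's Type III curve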


The following facts are observed to be true: The normal distribution admits
a pair of sufficient statistics for $(\theta_1, \theta_2)$ regarded as unknown, and also one
for each of $\theta_1$, $\theta_2$ when the other is regarded as known. The Cauchy distribution
does not admit any sufficient statistic for either parameter when the
other is known, nor a pair for both when they are both unknown. The Pearson
distribution admits sufficient statistics for $\theta_2$ or $\theta_3$, or both, when $\theta_1$ is regarded
as known, but for no set of parameters involving $\theta_1$, no matter what
assumptions are made with regard to the knowledge of $\theta_2$, $\theta_3$.

Of these assertions, the positive ones are readily demonstrated by throwing
the frequency function into the form (4) and taking the maximum likelihood
statistics for $\varphi_j$.
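The contrast between the normal and Cauchy cases can be checked numerically. The following is only an illustrative sketch: it assumes the parametrization in which $\theta_2$ is the standard deviation (normal) or the scale (Cauchy), and the two samples are hypothetical values chosen to share the same $\sum x_i$ and $\sum x_i^2$. For the normal law the likelihood then agrees on both samples for every $(\theta_1, \theta_2)$, as sufficiency of the pair of sums requires; for the Cauchy law it does not.

```python
import math

def normal_loglik(theta1, theta2, xs):
    """Log-likelihood of a normal sample; theta1 = mean, theta2 = standard deviation (assumed)."""
    n = len(xs)
    return (-0.5 * n * math.log(2.0 * math.pi * theta2 ** 2)
            - sum((x - theta1) ** 2 for x in xs) / (2.0 * theta2 ** 2))

def cauchy_loglik(theta1, theta2, xs):
    """Log-likelihood of a Cauchy sample; theta1 = location, theta2 = scale (assumed)."""
    return sum(math.log(theta2 / math.pi) - math.log(theta2 ** 2 + (x - theta1) ** 2)
               for x in xs)

# Two different hypothetical samples with sum(x) = 12 and sum(x^2) = 62.
sample_a = [1.0, 5.0, 6.0]
sample_b = [2.0, 3.0, 7.0]

def loglik_gap(loglik, theta1, theta2):
    """Difference of the log-likelihoods of the two samples at (theta1, theta2)."""
    return loglik(theta1, theta2, sample_a) - loglik(theta1, theta2, sample_b)
```

A vanishing gap for every parameter value is what one expects when the two sums form a sufficient pair; a single non-zero gap already shows that these sums are not sufficient for the Cauchy law, though it does not by itself prove the stronger negative assertion of the text.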


In order to prove the negative assertions, we observe that if, in accordance
with Theorem I, the function had the form (4), we should have

$$\frac{\partial^2}{\partial x\, \partial\theta_1} \log f = \sum_{k=1}^{\mu} \frac{\partial\Theta_k}{\partial\theta_1} \frac{dX_k}{dx}$$

(or, in dealing with the Cauchy distribution when $\theta_1$ is known, a corresponding
form in $\theta_2$; here $\mu = 1$ and the conclusion is immediate). If now we set
$y = -\theta_1$, we see that the left-hand member is of the form $F(x + y)$, and satisfies
the functional equation

$$F(x + y) = \sum_{k=1}^{\mu} A_k(x) B_k(y),$$

the solution of which (under the present conditions of differentiability) is
known to be*

$$F(u) = \sum_{k=1}^{N} P_k(u) e^{r_k u},$$

where $P_k(u)$ is a polynomial, $r_k$ a constant. It is readily verified that this
form does not apply in the cases to be considered here.
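For a single unknown location parameter the functional equation reduces to one product, $F(x + y) = A(x)B(y)$, so every two-rowed determinant $F(x_1 + y_1)F(x_2 + y_2) - F(x_1 + y_2)F(x_2 + y_1)$ must vanish. The sketch below tests this necessary condition numerically; the closed forms of $F$ are our own computation of $\partial^2 \log f / \partial x\, \partial\theta_1$ with $u = x - \theta_1$, under the assumption that $\theta_2$ is the standard deviation (normal) or the scale (Cauchy), and the function names are hypothetical.

```python
def F_normal(u, theta2=1.0):
    """Mixed derivative of log f for the normal law: the constant 1 / theta2^2."""
    return 1.0 / theta2 ** 2

def F_cauchy(u, theta2=1.0):
    """Mixed derivative of log f for the Cauchy law, as a function of u = x - theta1."""
    return 2.0 * (theta2 ** 2 - u ** 2) / (theta2 ** 2 + u ** 2) ** 2

def rank_one_defect(F, x1, x2, y1, y2):
    """Two-rowed determinant that vanishes whenever F(x + y) = A(x) * B(y)."""
    return F(x1 + y1) * F(x2 + y2) - F(x1 + y2) * F(x2 + y1)
```

With $x_1 = y_1 = 0$, $x_2 = y_2 = 1$ the determinant is zero for the normal law, while for the Cauchy law it equals $F(0)F(2) - F(1)^2 = 2 \cdot (-0.24) - 0 = -0.48$, so no single product representation is possible.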

The same method, and the requirement of finiteness of $\int_{-\infty}^{\infty} f\,dx$ (from which
it follows that $r_k = 0$ etc.), lead to the following:


THEOREM IV. Let $f(\theta_1, \dots, \theta_\nu, x)$ satisfy the hypotheses of Theorem I,
together with the further conditions that


and that $\Omega \times R - T$ contain all the points of the $\theta_1$-axis (for every $\theta_2, \dots, \theta_\nu$).
Then


k=1 
where O.=Qx(02, 6,) <0, U=U(h, 6,), V=V(a, 62, 6,). 


* For references and an elegant proof, see P. Stäckel, Sulla equazione funzionale
$f(x+y) = \sum_{i=1}^{n} X_i(x) Y_i(y)$, Atti della Reale Accademia dei Lincei, Rendiconti, vol. 222, pp. 392-393.

STATISTICAL ESTIMATION*


BY 
J. L. DOOB 


Let $\{x(p)\}$ be a family of chance variables depending on the parameter $p$.
Suppose that it is known that a certain chance variable $x$ belongs to the
family $\{x(p)\}$, and suppose that a sample of values of $x$: $x_1, \dots, x_n$, has
been obtained. The problem of estimation is to find a function $p(x_1, \dots, x_n)$
which in some sense is a suitable estimate of the value of $p$ for which $x(p) = x$.
Any function $p(x_1, \dots, x_n)$ will be called an estimate. If $\{p_n(x_1, \dots, x_n)\}$
are estimates of $p$ for samples of $1, 2, \dots$, the sequence of estimates will be
called a statistic.† A statistic $\{p_n(x_1, \dots, x_n)\}$ is called consistent if the
probability (based on the value $p_0$ of $p$) that $|p_n(x_1, \dots, x_n) - p_0| < \epsilon$ approaches
1 as $n$ becomes infinite for every positive $\epsilon$ and for every value $p_0$ of the
parameter in the range considered. It will frequently be true that $\lim_{n\to\infty} p_n = p_0$
with probability 1 for every $p_0$ in the range.

It is evident that if there is one consistent statistic there are infinitely
many, since if $N$ is any positive integer, the consistency of a statistic is
independent of its first $N$ estimates. This shows a fact, sometimes not made clear,
that the problem of estimation from a sample of $n$ for $n$ fixed throughout the
discussion has nothing whatsoever to do with the idea of consistency. In the
problem of estimation from a sample of $n$, what is desired is an estimate
$p_n(x_1, \dots, x_n)$ which is close to the value $p_0$ of the parameter which
determines the probabilities, and this closeness should depend as little as possible
on $p_0$. In this paper ideas connected with the estimates given by the method
of maximum likelihood, developed by R. A. Fisher, will be examined in detail.

If $f(x_1, \dots, x_r)$ is non-negative and integrable over $r$-dimensional space,

$$\int \cdots \int_E f(x_1, \dots, x_r)\, dx_1 \cdots dx_r$$

can be considered as the probability that a sample $(x_1, \dots, x_r)$ of a chance
variable (whose range is the set of points of $r$-dimensional space) is in the
set $E$, and $f(x_1, \dots, x_r)$ will be called a probability density in $r$ dimensions.
The value of the above integral will be called the probability of $E$. We can

* Presented to the Society, December 30, 1934; received by the editors April 4, 1935. 
Research under a grant-in-aid from the Carnegie Corporation. 
† This differentiation between the terms estimate and statistic is not customary, but allows the
usual language to be preserved while making the discussion more precise.




suppose that $f$ is defined at every point of $r$-dimensional space, defining $f$ as 0
where it was undefined originally.* Let $f(x_1, \dots, x_r; p)$ be such a probability
density in $r$ dimensions, depending on the parameter $p$. The set of these
densities for all values of $p$ in the range considered will be called a family of
probability densities. If $f(x_1, \dots, x_r; p)$ has the property that there is a domain $D$
in the $r$-dimensional space of $(x_1, \dots, x_r)$ with the property that, for each
value of $p$, $f > 0$ on $D$ except possibly for a set of points of Lebesgue measure
0 and $f = 0$ on the complement of $D$ except possibly for a set of points of
Lebesgue measure 0, the family will be said to have the property D. Let $r = 1$.
If $p_n(x_1, \dots, x_n)$ is defined as a value of $p$ which maximizes $\prod_{j=1}^{n} f(x_j; p)$ if
such a value exists, $p_n$ is called the $n$th maximum likelihood estimate of $p$.
The statistic $\{p_n\}$ is called the maximum likelihood statistic. If $r > 1$, the
definition is the same except that $x_j$ becomes a complex of $r$ numbers.
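As an illustration of this definition, the sketch below computes the $n$th maximum likelihood estimate for a family chosen purely for convenience (it does not appear in the text): the density $f(x; p) = p e^{-px}$ for $x > 0$, for which the maximizing value is $p_n = n / \sum_j x_j$. The grid search and the sample values are hypothetical and serve only as a cross-check of the closed form.

```python
import math

def log_likelihood(p, xs):
    """log of prod_j f(x_j; p) for the illustrative density f(x; p) = p * exp(-p * x)."""
    return len(xs) * math.log(p) - p * sum(xs)

def mle_closed_form(xs):
    """Value of p maximizing the likelihood above: p_n = n / sum(x_j)."""
    return len(xs) / sum(xs)

def mle_grid(xs, lo=0.01, hi=10.0, steps=100000):
    """Crude grid search for the maximizing p, used only as a cross-check."""
    best_p, best_val = lo, log_likelihood(lo, xs)
    for i in range(1, steps + 1):
        p = lo + (hi - lo) * i / steps
        val = log_likelihood(p, xs)
        if val > best_val:
            best_p, best_val = p, val
    return best_p

sample = [0.5, 1.5, 0.25, 0.75]    # hypothetical observations, sum = 3
p_hat = mle_closed_form(sample)     # 4 / 3
```

Working with the logarithm of the product leaves the maximizing value unchanged and avoids underflow for large samples.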

It has been shown that maximum likelihood statistics $\{p_n\}$ are consistent
in a very wide class of cases and that in a somewhat less wide class the
distribution of $n^{1/2}(p_n - p)$ as $n$ becomes infinite approaches the normal
distribution with mean 0 and variance $\sigma^2$, where

$$\frac{1}{\sigma^2} = -\int \cdots \int f \frac{\partial^2 \log f}{\partial p^2}\, dx_1 \cdots dx_r = \int \cdots \int f \Bigl[\frac{\partial \log f}{\partial p}\Bigr]^2 dx_1 \cdots dx_r. \tag{1}$$

In the case of a family of discrete-valued chance variables: when there is
an integer $N$ such that $x(p)$ takes on the values $1, \dots, N$ with probabilities
$f(1; p), \dots, f(N; p)$, and takes on no other values, the maximum likelihood
estimates are obtained, as above, by maximizing $\prod_{j=1}^{n} f(x_j; p)$, where the $x$'s
are now positive integers less than $N + 1$. The distribution of $n^{1/2}(p_n - p)$ as $n$
becomes infinite approaches the normal distribution with mean 0 and variance
$\sigma^2$ where $1/\sigma^2$ is the negative of the expectation of $\partial^2 \log f/\partial p^2$ or the
expectation of $[\partial \log f/\partial p]^2$.
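The agreement of the two expressions for $1/\sigma^2$ can be checked on a small discrete family. The sketch below is illustrative only: the family with $N = 3$ and probabilities $(1-p)^2$, $2p(1-p)$, $p^2$ is our own choice, and the derivatives are approximated by central differences, so the comparison holds only up to a small numerical tolerance. For this family both expressions should be close to $2/(p(1-p))$.

```python
import math

def pmf(x, p):
    """Illustrative discrete family on the values 1, 2, 3 (an assumption, not from the text)."""
    return {1: (1 - p) ** 2, 2: 2 * p * (1 - p), 3: p ** 2}[x]

def d_log_f(x, p, h=1e-5):
    """Central-difference approximation of d log f / dp."""
    return (math.log(pmf(x, p + h)) - math.log(pmf(x, p - h))) / (2 * h)

def d2_log_f(x, p, h=1e-4):
    """Central-difference approximation of d^2 log f / dp^2."""
    return (math.log(pmf(x, p + h)) - 2 * math.log(pmf(x, p))
            + math.log(pmf(x, p - h))) / h ** 2

def info_from_score(p):
    """Expectation of [d log f / dp]^2."""
    return sum(pmf(x, p) * d_log_f(x, p) ** 2 for x in (1, 2, 3))

def info_from_curvature(p):
    """Negative of the expectation of d^2 log f / dp^2."""
    return -sum(pmf(x, p) * d2_log_f(x, p) for x in (1, 2, 3))
```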


* Quotients with f in the denominator will frequently appear in the formulas. These quotients 
are defined as 0 when f=0. Integration will always be over all space when the limits of integration 
are not stated explicitly. 

† J. L. Doob, these Transactions, vol. 36 (1934), pp. 766-775. A proof given by H. Hotelling,
ibid., vol. 32 (1930), pp. 847-859, holds for discrete-valued chance variables. The study of maximum
likelihood statistics was started by R. A. Fisher in the Philosophical Transactions of the Royal
Society of London, (A), vol. 222 (1921), pp. 309-368, and continued in the Proceedings of the
Cambridge Philosophical Society, vol. 22 (1925), pp. 700-725, and in the Proceedings of the Royal
Society, (A), vol. 144 (1934), pp. 285-307. The work of Fisher's second paper (which will be referred
to as Fisher II) is dealt with systematically in this paper, and most of the results in this paper were
stated in a general way by Fisher.

These and other facts suggested to R. A. Fisher that in each case the
quantity $1/\sigma^2$ be called the amount of information in the original family of
chance variables, or the amount of information in a sample of 1, pertinent to
estimating $p$. Since in the continuous case the information is obtained from
values of $x_1, \dots, x_r$ and in the discrete case from values of $x$ (which can only
take on one of $N$ values), the amount of information $1/\sigma^2$ in the two cases
will be denoted by $I(x_1, \dots, x_r)$, $I(x)$, respectively. R. A. Fisher showed*
using the first integral in (1) that $I(x_1, \dots, x_r; y_1, \dots, y_r)$: the amount of
information in a sample of 2, i.e., the amount of information in a sample of 1
from the distribution with density

$$f(x_1, \dots, x_r; p)\, f(y_1, \dots, y_r; p)$$

in $2r$ dimensions, is $2I(x_1, \dots, x_r)$, and more generally that the amount of
information in a sample of $s$ is $sI(x_1, \dots, x_r)$. The corresponding theorem
holds for the discrete case. The quantity $I(x_1, \dots, x_r)$ is a function of $p$,
but this function can be made a constant by a suitable transformation to a
new parameter.
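The additivity of information over independent observations can be verified by direct enumeration in the discrete case. The sketch below assumes, for illustration only, a variable taking the value 1 with probability $p$ and the value 2 with probability $1 - p$; the information in a sample of $s$ is computed as the expectation of the squared derivative of the logarithm of the joint probability, and should equal $s$ times the information $1/(p(1-p))$ in a sample of 1.

```python
import itertools

def pmf(x, p):
    """Illustrative two-valued family: value 1 with probability p, value 2 otherwise."""
    return p if x == 1 else 1.0 - p

def score(x, p):
    """d log f(x; p) / dp for the family above."""
    return 1.0 / p if x == 1 else -1.0 / (1.0 - p)

def information(s, p):
    """Expectation of the squared score of a sample of s independent observations."""
    total = 0.0
    for xs in itertools.product((1, 2), repeat=s):
        prob = 1.0
        for x in xs:
            prob *= pmf(x, p)
        # The score of the joint probability is the sum of the individual scores.
        total += prob * sum(score(x, p) for x in xs) ** 2
    return total
```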

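THEOREM 1. Let $\{f(x_1, \dots, x_r; p)\}$ be a family of probability densities with
the property D. We suppose that $f_p = \partial f(x_1, \dots, x_r; p)/\partial p$ exists, that

$$\int \cdots \int \frac{f_p^2}{f}\, dx_1 \cdots dx_r$$

is finite, and that, if $E$ is any Lebesgue measurable point set in the space of
$(x_1, \dots, x_r)$,

$$\frac{d}{dp} \int \cdots \int_E f\, dx_1 \cdots dx_r = \int \cdots \int_E f_p\, dx_1 \cdots dx_r.$$

Let $\{\alpha_j(x_1, \dots, x_r)\}$, $j = 1, \dots, s$, be Lebesgue measurable functions, and
suppose that the $s$-variate distribution of $\alpha_1, \dots, \alpha_s$ is determined by a probability
density $\varphi(\alpha_1, \dots, \alpha_s; p)$. Then the family $\{\varphi\}$ has the property D. Suppose
that $\varphi(\alpha_1, \dots, \alpha_s; p)$ satisfies the same regularity conditions in $(\alpha_1, \dots, \alpha_s; p)$
that $f(x_1, \dots, x_r; p)$ does in $(x_1, \dots, x_r; p)$. Then†

$$I(\alpha_1, \dots, \alpha_s) = \int \cdots \int \frac{\varphi_p^2}{\varphi}\, d\alpha_1 \cdots d\alpha_s = \text{L.U.B.} \sum_{j=1}^{N} \frac{\Bigl[\int \cdots \int_{E_j} f_p\, dx_1 \cdots dx_r\Bigr]^2}{\int \cdots \int_{E_j} f\, dx_1 \cdots dx_r}, \tag{4}$$

* Fisher II, pp. 709-710.
† The notation L.U.B. will be used for "least upper bound."

where $E_1, \dots, E_N$ are any Lebesgue measurable point sets which are mutually
exclusive, such that the denominators in the sum are positive, and which have
the property that there are Borel measurable sets $E_1', \dots, E_N'$ in the
$s$-dimensional space of $\alpha_1, \dots, \alpha_s$ such that $E_j$ is the set of points $(x_1, \dots, x_r)$
corresponding to the set $E_j'$.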


To simplify the notation we prove the theorem for $r = s = 1$. The proof
that the family $\{\varphi\}$ has the property D is simple and will be omitted. Divide
the $\alpha$-axis into mutually exclusive Borel-measurable sets $E_1', \dots, E_N'$ having
positive probability, which cover the $\alpha$-axis, except possibly for a set of
probability 0,* and which have the property that either $\varphi_p \ge 0$, or $\varphi_p < 0$, on
each set. Since the family $\{\varphi\}$ has the property D, if $\varphi(\alpha; p) = 0$ on a set $E'$
of values of $\alpha$ of positive Lebesgue measure, $\varphi_p(\alpha; p) = 0$ almost everywhere
on $E'$. Then we can apply the inequality of Schwarz to obtain

$$\sum_{j=1}^{N} \frac{\Bigl[\int_{E_j'} \varphi_p(\alpha; p)\, d\alpha\Bigr]^2}{\int_{E_j'} \varphi(\alpha; p)\, d\alpha} \le \sum_{j=1}^{N} \int_{E_j'} \frac{\varphi_p^2}{\varphi}\, d\alpha. \tag{5}$$

The first sum can be considered as $I(z)$, where $z$ is the chance variable taking
on the value $j$ on $E_j'$, i.e., $z$ takes on the value $j$ with probability $\int_{E_j'} \varphi(\alpha; p)\, d\alpha$,
$j = 1, \dots, N$. If $m_j$ is the greatest lower bound of $|\varphi_p|/\varphi$ on $E_j'$,

$$\sum_{j=1}^{N} m_j^2 \int_{E_j'} \varphi(\alpha; p)\, d\alpha \le I(z) \le I(\alpha). \tag{6}$$

The least upper bound of the sum on the left for all possible subdivisions of
the $\alpha$-axis is precisely $I(\alpha)$,† and is the same as the least upper bound of the
sum leaving out the restriction that $\varphi_p$ has only one sign on each set $E_j'$.
The least upper bound of $I(z)$, the first sum in (5), must also be $I(\alpha)$. Let $E_j$
be the set of points $x$ corresponding to the $\alpha$-set $E_j'$. Since

$$\int_{E_j'} \varphi(\alpha; p)\, d\alpha = \int_{E_j} f(x; p)\, dx, \qquad \int_{E_j'} \varphi_p(\alpha; p)\, d\alpha = \int_{E_j} f_p(x; p)\, dx, \tag{7}$$

the first sum in (5) becomes the sum in (4).

This theorem suggests a more generally applicable definition of amount
of information. From now on, if the family $f(x_1, \dots, x_r; p)$ satisfies the
conditions of Theorem 1, the amount of information in a set of measurable
functions $\alpha_1(x_1, \dots, x_r), \dots, \alpha_s(x_1, \dots, x_r)$ will be defined by the last
expression in (4). More general definitions can be made, but this generality will
suffice for the purposes of this paper.

* The parameter $p$ is considered fixed throughout this discussion.

† If $\varphi(\alpha; p) = 1$ for $0 < \alpha < 1$ and 0 otherwise, this can be inferred from the definition of the
Lebesgue integral, and is in fact a definition of integration suggested by Young; cf. E. W. Hobson,
The Theory of Functions of a Real Variable, 2d edition, vol. 2, pp. 369-373. The general case can be
treated in a similar way or reduced to this one by a transformation of the $\alpha$-axis.

The following lemma will be needed below. 


LEMMA 1. Let $\{x(p)\}$, $\{x_1(p)\}$, $\{x_2(p)\}, \dots$ be families of chance
variables whose ranges are the points of the $x$-axis. Let $P(p, E)$, $P_1(p, E)$,
$P_2(p, E), \dots$ be the probabilities that these variables have their values in the set
$E$. We suppose that $I_p$, $I_p^{(1)}, \dots$, the amounts of information in the families,
defined in accordance with the definition just given, are finite-valued, and that
there is a value $p_0$ of $p$ such that

$$\lim_{n\to\infty} P_n(p_0, E) = P(p_0, E), \tag{8}$$

$$\lim_{n\to\infty} \frac{d}{dp} P_n(p_0, E) = \frac{d}{dp} P(p_0, E), \tag{9}$$

for all Borel measurable sets $E$. Then

$$\liminf_{n\to\infty} I_{p_0}^{(n)} \ge I_{p_0}. \tag{10}$$

If $E_1, \dots, E_N$ is any division of the $x$-axis into Borel measurable sets
such that $P(p_0, E_j) > 0$, $j = 1, \dots, N$,

$$I_{p_0}^{(n)} \ge \sum_{j=1}^{N} \frac{\Bigl[\frac{d}{dp} P_n(p_0, E_j)\Bigr]^2}{P_n(p_0, E_j)}. \tag{11}$$

This implies that

$$\liminf_{n\to\infty} I_{p_0}^{(n)} \ge \sum_{j=1}^{N} \frac{\Bigl[\frac{d}{dp} P(p_0, E_j)\Bigr]^2}{P(p_0, E_j)}. \tag{12}$$

But the sum can be made arbitrarily close to $I_{p_0}$ by choosing $E_1, \dots, E_N$
properly, thus proving (10).

In the proof of this lemma, (8) and (9) are only needed for sets $E$ of the
type needed in (12) to have the sum near $I_{p_0}$. Therefore if the distribution of
$x(p)$ has a continuous density function, $f(x; p)$, such that $f_p(x; p)$ is continuous
and that (2) holds (where $r = 1$), the sum in (12) has least upper bound

$$\int \frac{[f_p(x; p_0)]^2}{f(x; p_0)}\, dx,$$

and it is sufficient to know that (8) and (9) are true for intervals.


THEOREM 2. Let $\{f(x_1, \dots, x_r; p)\}$ be a family of probability densities
satisfying the conditions of Theorem 1, and let $\alpha_1(x_1, \dots, x_r), \dots, \alpha_s(x_1, \dots, x_r)$
be Lebesgue measurable functions. Then

$$I(\alpha_1, \dots, \alpha_s) \le I(x_1, \dots, x_r), \tag{13}$$

and there is equality for a value $p_0$ of $p$ if and only if a function $\varphi(\alpha_1, \dots, \alpha_s)$
exists such that, with probability 1,*

$$\frac{f_p(x_1, \dots, x_r; p_0)}{f(x_1, \dots, x_r; p_0)} = \varphi(\alpha_1, \dots, \alpha_s). \tag{14}$$

There are infinitely many functions $\alpha(x_1, \dots, x_r)$ such that $I(\alpha) = I(x_1, \dots, x_r)$
for all values of $p$.

The inequality (13) can be obtained by applying Schwarz's inequality to
(4). In order to discuss the case of equality, a new expression will be given
for $I(\alpha_1, \dots, \alpha_s)$. We assume, to simplify the notation, that $r = s = 1$. Let $E'$
be a Borel measurable set of values of a new variable $\alpha$, and let $E$ be the set
of values of $x$ for which $\alpha(x)$ is in $E'$. Then, a fixed value of $p$ being assumed
throughout the discussion,

$$Q(E') = \int_E f_p(x; p)\, dx \tag{15}$$

defines a set function $Q(E')$ on the $\alpha$-axis. This set function is completely
additive. Let the measure of $E'$ be defined as the probability that $\alpha(x)$ is
in $E'$, i.e., the probability of $E$. Then if the measure of $E'$ vanishes, $E$ has
probability 0. The probability of $E$ is

$$\int_E f(x; p)\, dx,$$

so that $f(x; p)$ must vanish almost everywhere on $E$. Since this is true for a
single value of $p$ it must be true for all values of $p$ in the range considered,
because the family $f(x; p)$ has the property D.† Then it is readily seen that
$f_p = 0$ almost everywhere on $E$. Thus $Q(E') = 0$. Since the vanishing of the
measure of $E'$ implies the vanishing of $Q(E')$, $Q(E')$ has a density of
distribution $\varphi(\alpha)$ on the $\alpha$-axis.‡ Writing the integrals in terms of $x$,

$$Q(E') = \int_E f_p(x; p)\, dx = \int_E \varphi[\alpha(x)] f(x; p)\, dx. \tag{16}$$

* As elsewhere in this paper, the expression "with probability 1" means, in terms of the measure
(probability) defined on the space in question, "on a set of measure 1."

† The exceptional set of measure 0 may vary with $p$.

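‡ O. Nikodym, Fundamenta Mathematicae, vol. 15 (1930), p. 179.

The expression for $I(\alpha)$ can be rewritten:

$$I(\alpha) = \text{L.U.B.} \sum_{j=1}^{N} \frac{\Bigl\{\int_{E_j} \varphi[\alpha(x)] f(x; p)\, dx\Bigr\}^2}{\int_{E_j} f(x; p)\, dx}, \tag{17}$$

and therefore by an argument used above,

$$I(\alpha) = \int \varphi[\alpha(x)]^2 f(x; p)\, dx. \tag{18}$$

If the set $E_j$ is chosen so that $\varphi[\alpha(x)]$ does not change sign on it, and if $m_j$ is
the greatest lower bound of $|\varphi[\alpha(x)]|$ on $E_j$,

$$I(\alpha) \ge \sum_{j=1}^{N} m_j \int_{E_j} |\varphi[\alpha(x)]| f(x; p)\, dx = \sum_{j=1}^{N} m_j \Bigl|\int_{E_j} f_p(x; p)\, dx\Bigr|. \tag{19}$$

Using the fact that

$$I(\alpha) = \text{L.U.B.} \sum_{j=1}^{N} m_j \int_{E_j} |\varphi[\alpha(x)]| f(x; p)\, dx = \text{L.U.B.} \sum_{j=1}^{N} m_j \Bigl|\int_{E_j} f_p(x; p)\, dx\Bigr|, \tag{20}$$

we shall show that

$$I(\alpha) = \int \varphi[\alpha(x)] f_p(x; p)\, dx. \tag{21}$$

We can suppose the subscripts so chosen that $\varphi[\alpha(x)] \ge 0$ on $E_1, \dots, E_\nu$,
$\varphi \le 0$ on $E_{\nu+1}, \dots, E_N$. Then (20) becomes

$$I(\alpha) = \text{L.U.B.} \Bigl\{\sum_{j=1}^{\nu} m_j \int_{E_j} f_p\, dx - \sum_{j=\nu+1}^{N} m_j \int_{E_j} f_p\, dx\Bigr\}, \tag{22}$$

and we know that to find the least upper bound of the sum we must make a
finer and finer subdivision of the $\alpha$-axis. By reasoning used before, (22)
becomes $\varphi f_p$ integrated over the set where $\varphi \ge 0$ less $-\varphi f_p$ integrated over the
set where $\varphi < 0$, which proves (21). By Schwarz's inequality,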


$$I(\alpha)^2 \le \int \varphi[\alpha(x)]^2 f(x; p)\, dx \int \frac{f_p^2}{f}\, dx = I(\alpha) \cdot I(x). \tag{23}$$

This inequality shows that $I(\alpha) \le I(x)$ and that there can be equality only
when $\varphi f^{1/2}$, $f_p/f^{1/2}$ are linearly dependent, i.e., when (14) is true, the ratio of
the constants of dependence being determined by (16).

Conversely if there is a function $\varphi_1(\alpha)$ such that, for some value of $p$,
$f_p/f = \varphi_1(\alpha)$ with probability 1, (16) becomes

$$\int_E \varphi_1[\alpha(x)] f(x; p)\, dx = \int_E \varphi[\alpha(x)] f(x; p)\, dx, \tag{16$'$}$$

and since this equation determines $\varphi$, $\varphi$ and $\varphi_1$ are identical except possibly
for a set of values of $\alpha$ of probability 0. Equation (18) now shows that
$I(\alpha) = I(x)$.

The condition (14) is satisfied if the correspondence between the $x$ and $\alpha$
spaces is one-to-one, excluding sets of 0 probability if necessary. To prove
the last part of the theorem it is thus sufficient to show that there are measurable
functions $\alpha(x_1, \dots, x_r)$ which determine a one-to-one correspondence
between the values of $\alpha$ and the points $(x_1, \dots, x_r)$, excluding a set of values
of $\alpha$ of probability 0 and a set of points $(x_1, \dots, x_r)$ of probability 0. If these
exceptional point sets are of probability 0 for one value of $p$ they will be for
all values of $p$, since the family $f(x_1, \dots, x_r; p)$ has the property D. Such
functions have been discussed in detail.*

If equation (14) can be integrated, it becomes an equation of the form


An estimate $\alpha(x_1, \dots, x_r)$ which satisfies an equation of this type (for $s = 1$)
is called a sufficient estimate.† It has been shown by B. O. Koopman‡ that if
$f(x_1, \dots, x_r)$ is of the form $\prod_{j=1}^{r} g(x_j; p)$, and if there is a sufficient estimate
satisfying certain regularity conditions, $g(x; p)$ must be of the form


THEOREM 3. Let $\{f(x; p)\}$ be a family of probability densities, and suppose
that for each value of the integer $r$, $f(x_1, \dots, x_r; p) = \prod_{j=1}^{r} f(x_j; p)$ satisfies the
regularity conditions of Theorem 1. Let $\{p_n\}$ be the maximum likelihood statistic,
and suppose that $f(x; p)$ satisfies regularity conditions insuring that

$$(nI(x))^{1/2}(p_n - p) = \frac{1}{(nI(x))^{1/2}} \sum_{j=1}^{n} \frac{f_p(x_j; p)}{f(x_j; p)} + R_n, \tag{26}$$

where for each value of $p$, $\lim_{n\to\infty} R_n = 0$ with probability 1.§ Then

$$I(p_n) \le nI(x), \qquad \lim_{n\to\infty} \frac{I(p_n)}{n} = I(x). \tag{27}$$

* Cf. for instance B. Jessen, Acta Mathematica, vol. 63 (1934), pp. 260-263.

† R. A. Fisher II, pp. 712-714.

‡ These Transactions, vol. 39 (1936), pp. 399-409. Koopman treats the general case where there
is more than one parameter.

§ Such conditions were determined by J. L. Doob, these Transactions, vol. 36 (1934), pp. 770-
771; cf. equation (47) on p. 773.

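The amount of information from a sample of $n$ is $nI(x)$, so that, using
Theorem 2, $I(p_n) \le nI(x)$. Lemma 1 will be used to prove the second part of
the theorem. Let $q_n = (nI(x))^{1/2} p_n$, where $p$ will be supposed fixed throughout
this discussion. Equation (26) is used* to show that the distribution of
$q_n - p(nI(x))^{1/2}$ for large $n$ is nearly normal, with mean 0 and variance 1:
i.e., that if $P_n(q, E)$ is the probability that $q_n$ is in the interval $E$, $\mu_1 < q_n < \mu_2$,
where $q = p(nI(x))^{1/2}$,

$$\lim_{n\to\infty} P_n(q, E) = \frac{1}{(2\pi)^{1/2}} \int_{\mu_1}^{\mu_2} e^{-(x-q)^2/2}\, dx. \tag{28}$$

This corresponds to equation (8). In order to obtain the equation corresponding
to (9), we evaluate $P_n(q, E)$. Let $E_n$ be the set of points $(x_1, \dots, x_n)$ at
which $\mu_1 < q_n < \mu_2$. Then

$$P_n(q, E) = \int \cdots \int_{E_n} \prod_{j=1}^{n} f(x_j; p)\, dx_j, \tag{29}$$

$$\frac{d}{dq} P_n(q, E) = \int \cdots \int_{E_n} A_n \prod_{j=1}^{n} f(x_j; p)\, dx_j, \tag{30}$$

where

$$A_n = \frac{1}{(nI(x))^{1/2}} \sum_{j=1}^{n} \frac{f_p(x_j; p)}{f(x_j; p)}. \tag{31}$$

The set $E_n$ is determined by $\mu_1 < q + A_n + R_n < \mu_2$ and the difference between
the integral (30) over $E_n$ and over $E_n'$: the set determined by

$$\mu_1 < q + A_n < \mu_2 \tag{32}$$

is the integral (30) over $E_n - E_n \cdot E_n'$ less the integral over $E_n' - E_n \cdot E_n'$. Now
by definition of $I(x_1, \dots, x_n)$, and by Fisher's result stated on page 412,

$$I(x_1, \dots, x_n) = nI(x) = \int \cdots \int \Bigl[\sum_{j=1}^{n} \frac{f_p(x_j; p)}{f(x_j; p)}\Bigr]^2 \prod_{j=1}^{n} f(x_j; p)\, dx_j.$$

Therefore, for any measurable set $e$ of points $(x_1, \dots, x_n)$, Schwarz's inequality gives

$$\Bigl[\int \cdots \int_{e} A_n \prod_{j=1}^{n} f(x_j; p)\, dx_j\Bigr]^2 \le \int \cdots \int_{e} \prod_{j=1}^{n} f(x_j; p)\, dx_j. \tag{33}$$

* Ibid., pp. 773-774.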

Then if the last integral on the right, the probability of $e$, is small, the
integral on the left is also. To show that as $n$ becomes infinite, $E_n'$ can be
substituted for $E_n$ in (30), it is therefore only necessary to show that the
probabilities of $E_n - E_n \cdot E_n'$, $E_n' - E_n \cdot E_n'$ approach 0 with $1/n$. The set
$E_n - E_n \cdot E_n'$ is the set of points satisfying

$$q + A_n \le \mu_1 < q + A_n + R_n < \mu_2 \tag{34a}$$

or

$$\mu_1 < q + A_n + R_n < \mu_2 \le q + A_n, \tag{34b}$$

and $E_n' - E_n \cdot E_n'$ is the sum of two sets defined analogously. The proofs that
the probabilities of each of these four sets approach 0 with $1/n$ are similar,
and only that for the set determined by (34a) will be given. This set is
included in the set $e_n$ determined by

$$q + A_n \le \mu_1 < q + A_n + R_n. \tag{35}$$

These inequalities can be rewritten

$$0 \le \mu_1 - q - A_n < R_n. \tag{36}$$

Let $\epsilon$ be a positive number. The part of $e_n$ where $\mu_1 - q - A_n > \epsilon$ has
probability approaching 0 since $\lim_{n\to\infty} R_n = 0$ with probability 1. The part of $e_n$
where $\mu_1 - q - A_n \le \epsilon$ is included in the set where

$$\mu_1 - q - \epsilon \le A_n \le \mu_1 - q. \tag{37}$$

Since $\{f_p(x; p)/f(x; p)\}^2$ is integrable, the probability of the set determined
by (37) approaches

$$\frac{1}{(2\pi)^{1/2}} \int_{\mu_1 - q - \epsilon}^{\mu_1 - q} e^{-x^2/2}\, dx$$

as $n$ becomes infinite, by the Laplace-Liapounoff theorem.* We have thus
shown, since $\epsilon$ is arbitrary, that the probability of $e_n$ approaches 0 with $1/n$,
and thus that as $n$ becomes infinite, $E_n'$ can be substituted for $E_n$ in (30).


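* A proof of this theorem, with the required amount of generality, is given by A. Khintchine,
Ergebnisse der Mathematik, vol. 2, No. 4: Asymptotische Gesetze der Wahrscheinlichkeitsrechnung,
pp. 1-8.

Let $\lambda_1, \dots, \lambda_N$ be a set of real numbers, where
$\mu_1 - q = \lambda_1 < \lambda_2 < \cdots < \lambda_N = \mu_2 - q$,
and let $e_n^{(j)}$ be the set of points where

$$\lambda_j \le A_n < \lambda_{j+1} \qquad (j = 1, \dots, N - 2),$$

defining $e_n^{(N-1)}$ similarly except that equality is allowed on the right. Then
if $p(e_n)$ is the probability of $e_n$,

$$\sum_{j=1}^{N-1} \lambda_j\, p(e_n^{(j)}) \le \int \cdots \int_{E_n'} A_n \prod_{j=1}^{n} f(x_j; p)\, dx_j \le \sum_{j=1}^{N-1} \lambda_{j+1}\, p(e_n^{(j)}). \tag{38}$$

But, using the Laplace-Liapounoff theorem again,

$$\lim_{n\to\infty} p(e_n^{(j)}) = \frac{1}{(2\pi)^{1/2}} \int_{\lambda_j}^{\lambda_{j+1}} e^{-x^2/2}\, dx. \tag{39}$$

Then

$$\liminf_{n\to\infty} \frac{d}{dq} P_n(q, E) \ge \sum_{j=1}^{N-1} \lambda_j \frac{1}{(2\pi)^{1/2}} \int_{\lambda_j}^{\lambda_{j+1}} e^{-x^2/2}\, dx. \tag{40}$$

Since the division of the axis determined by $\lambda_1, \dots, \lambda_N$ can be made
arbitrarily fine,

$$\liminf_{n\to\infty} \frac{d}{dq} P_n(q, E) \ge \frac{1}{(2\pi)^{1/2}} \int_{\mu_1 - q}^{\mu_2 - q} x e^{-x^2/2}\, dx. \tag{41}$$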


A slight modification of the discussion shows that 


d 1 u2-@ 
42 lim sup — P,.(q, E) S f 
(42) (q, E) ned, 
Hence 
d 1 d 1 
(43) lim — P,(q, E) = —— f = — f e(2-9) 
dq (29)? dq (2m)? J,, 


Lemma 1 can now be applied. If a family of densities is determined by 


g(x, q 


the amount of information obtainable from a sample of 1 is readily seen to be 
1. Therefore by Lemma 1, 


(44) $\liminf_{n \to \infty} I(q_n) \ge 1.$


The quantity $I(q_n)$ refers to the distribution of $q_n = (nI(x))^{1/2} p_n$ when
$p\,(nI(x))^{1/2}$ is the parameter. Therefore $I(q_n) = I(p_n)/(nI(x))$, so that (44)
becomes $\liminf_{n \to \infty} I(p_n)/n \ge I(x)$. This, taken in conjunction with the fact that
$I(p_n) \le nI(x)$, shows the truth of (27).


COLUMBIA UNIVERSITY,
NON-SINGULAR MULTILINEAR FORMS AND CERTAIN
p-WAY MATRIX FACTORIZATIONS* 


BY 
RUFUS OLDENBURGER 


1. Introduction. Let there be given a p-way matrix $A = (a_{i_1 \cdots i_p})$,
$i_1, \dots, i_p = 1, \dots, n$. The operation which takes A into $A' = (a_{i_1 \cdots i_p} b_{i_s j_s})$,
where $B = (b_{i_s j_s})$ is a non-singular 2-way matrix and the repeated index
indicates summation, is called a non-singular linear transformation on the
index $i_s$ of A with the matrix B. It is also said to be a non-singular linear
transformation on A. If a matrix A' is obtained from A by making
non-singular linear transformations on the indices of A with matrices having
elements in a field $\Phi$, then A' is said to be equivalent in the field $\Phi$ to A. If A
and A' are 2-way matrices this is equivalence in the ordinary sense.

The matrix A is said to be non-singular if A is equivalent in some field $\Phi$,
where $\Phi$ contains the elements of A, to $\delta = (\delta_{i_1 \cdots i_p})$, where $\delta_{i_1 \cdots i_p} = 1$ for
$i_1 = \cdots = i_p = 1, \dots, n$, and $\delta_{i_1 \cdots i_p} = 0$ if
$(i_1, \dots, i_p) \neq (1, \dots, 1), \dots, (n, \dots, n)$. Similarly a p-way multilinear form

$G = a_{i_1 \cdots i_p} x^{(1)}_{i_1} \cdots x^{(p)}_{i_p}$   $(i_1, \dots, i_p = 1, \dots, n)$

is said to be non-singular if G is equivalent under non-singular linear
transformations on the sets of variables $x^{(1)}_{i_1}, \dots, x^{(p)}_{i_p}$ to

$F = \sum_{i=1}^{n} x^{(1)}_{i} \cdots x^{(p)}_{i}.$

In chapter I of this paper sets of necessary and sufficient conditions, which
may be applied in a finite number of steps to a given matrix, are derived for
a p-way matrix A, and therefore for its associated form G, to be non-singular.
It is necessary in the treatment to distinguish between the cases† where $p = 3$
and $p \ge 4$. Among necessary and sufficient conditions for non-singularity it is
proved that a matrix A as given above is non-singular if and only if A can be
"factored" into the form $(c^{(1)}_{\alpha i_1} \cdots c^{(p)}_{\alpha i_p})$, where $(c^{(1)}_{\alpha i_1}), \dots, (c^{(p)}_{\alpha i_p})$ are non-singular
2-way matrices.


* Presented to the Society, March 30 and April 6 and 7, 1934. Abstracts appeared in the Bulletin
of the American Mathematical Society, vol. 40 (1934), pp. 219, 226, 227, under the numbers 140,
164, 165 respectively. Received by the editors March 22, 1935.

† The treatment of the case p=2 is assumed known.

The factorization property of a non-singular matrix A suggests the more 
general problem of —_— the conditions under which a matrix can be 
written in the form (a a, tp=1,---, m, where (ce?) is 
singular, and (c&),---, (<2) are non-singular.* 

In the factorizations mentioned above the index α is summed. Necessary
and sufficient conditions are also obtained (chapter III) for a matrix 
B=(bai,--.i,) to be of the form (a, a not summed, where the 
3-way (a%i,)» , (a%%;,) are non- if arrayed as 2-way mat- 
rices with a6 as the row ‘sites The method of treatment applies to the case 
where B=(b;,...:,) factors into the form (cf) -- - where a is not 
summed, and (cin, - ++, are non-singular. 

The terminology and notations used in the ordinary theory of 2-way
matrices are assumed known to the reader.

The paper is divided as follows: §1, Introduction. Chapter I, Non-singular
multilinear forms: §2, Definitions; §3, Similar transformations; §4,
Preliminary theorems; §5, Necessary and sufficient conditions for a matrix to
be non-singular; §6, Note on invariant factors. Chapter II, Factorization of
p-way matrices into a product of 2-way matrices one of which is singular:
§7, Introduction; §8, Canonical diagonal 2-way matrices; §9, Necessary and
sufficient conditions for the equivalence of a set of 2-way matrices to a set
of diagonal matrices; §10, Necessary and sufficient conditions for the
equivalence of a set of p-way matrices, $p \ge 3$, to a set of diagonal matrices. Chapter
III, Factorization of p-way matrices into 3-way matrices: §11, Introduction;
§12, Factorization into multiple composites.


CHAPTER I. NON-SINGULAR MULTILINEAR FORMS 


2. Definitions. The number of elements in the range of an index i is said
to be the order of i. Thus if i varies over 1, 2, ..., n, then i is of order n.
A matrix is said to be of order n if each index is of order n. An ordered set


* In the paper entitled A new method in the theory of quantics, Journal of Mathematics and Phys- 
ics, vol. 8 (1929), pp. 83-84, Hitchcock shows how a matrix of order associated with a polyadic 
can always be factored into a sum of products of 2-way matrices, i.e., 


In his paper entitled The expression of a tensor or a polyadic as a sum of products, Journal of Mathe- 
matics and Physics, vol. 6 (1927), pp. 164-189, he considers the problem of finding the values of n, p, h 
for which a matrix can be factored as above into a sum of products of 2-way matrices. He solves a 
few special cases, but does not solve the general problem. 


of indices of a matrix is called a partition† of indices. Two partitions $T_1$, $T_2$
are said to be equal if they have the same number of indices and
corresponding indices are of the same order (the first indices "correspond," the second
indices "correspond," etc.). We then write $T_1 = T_2$. The product of the orders
of the indices in a partition is called the order of the partition. An asterisk on
T, where T denotes a partition, indicates that the indices of T have been
assigned fixed values. For example, if $T = ijk$, and we assign to i, j, k the
values 2, 4, 3 respectively, we have $T^* = 243$. If $T_1 = T_2$ and corresponding
indices of $T_1$ and $T_2$ have been assigned the same fixed values, we write
$T_1^* = T_2^*$.

Let $T_1, T_2, \dots, T_r$ be mutually exclusive, exhaustive partitions of the
indices of a matrix A. The display $(a_{T_1 \cdots T_r})$ of A is the matrix obtained by
assigning $T_1, \dots, T_r$ as indices to the different directions of an r-space. The
indices in each partition are assumed for convention to vary from right to left;
e.g., if $i_1 = 1, 2$; $i_2 = 1, 2, 3$; $T = i_1 i_2$, then T varies over the range $(i_1 i_2) = (11)$,
(12), (13), (21), (22), (23).
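
The ordering convention for a partition can be illustrated with a short sketch. This is not part of the original paper; the function name `partition_range` is hypothetical, and the code only assumes the convention stated above, namely that the rightmost index of a partition varies fastest.

```python
from itertools import product


def partition_range(orders):
    """List the values of a partition T = i1 i2 ... in display order.

    `orders` holds the order of each index in T. Indices are 1-based and
    the rightmost index varies fastest, as in the example in the text.
    """
    return list(product(*(range(1, order + 1) for order in orders)))


if __name__ == "__main__":
    # i1 = 1, 2 and i2 = 1, 2, 3, so T = i1 i2 has order 2 * 3 = 6.
    for value in partition_range([2, 3]):
        print("".join(str(v) for v in value))
```

The length of the returned list is the order of the partition, that is, the product of the orders of its indices.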

The $(T_1, \dots, T_r)$ diagonal elements of a matrix $A = (a_{T_1 \cdots T_r})$ in which
$T_1 = T_2 = \cdots = T_r$ are the elements obtained by letting $T_1^* = T_2^* = \cdots = T_r^*$.
A matrix $A = (a_{i_1 \cdots i_p})$ is said to be a diagonal matrix if its only non-vanishing
elements are $(i_1, \dots, i_p)$ diagonal elements.

A T-layer of A is a minor of A obtained by fixing the partition T in the
sense that the indices of T are assigned fixed values, and letting the indices
of A not contained in T vary over their complete ranges. The T-rank of A
is the number of linearly independent T-layers of A. A matrix A is said to
be non-singular on T if the T-layers of A are linearly independent. If a
matrix $A = (a_{ijk})$ is non-singular on ij, and k, then A possesses an inverse
$A^{-1} = (A_{kij})$ on ij, k, where


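$(a_{ijk} A_{k i'j'}) = (\delta_{ij\,i'j'})$;   $(a_{ijk'} A_{kij}) = (\delta_{kk'})$.


$(\delta_{kk'})$ is a Kronecker delta. $(\delta_{ij\,i'j'})$ displayed in the form $(\delta_{TT'})$, $T = ij$,
$T' = i'j'$, is also a Kronecker delta.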

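A δ-matrix on $(T_1, \dots, T_r)$ is the matrix $(\delta_{T_1 \cdots T_r})$, where $T_1 = T_2 = \cdots = T_r$,
and $\delta_{T_1 \cdots T_r} = 1$ when $T_1^* = T_2^* = \cdots = T_r^*$, while $\delta_{T_1 \cdots T_r} = 0$
otherwise.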

The composite on T of the matrices $A = (a_{\rho T})$, $B = (b_{T\sigma})$, where ρ, σ, T are
partitions, is defined to be the matrix $A|T|B = (a_{\rho T} b_{T\sigma})$. The repetition of
the partition T in the last matrix indicates that the indices in T are summed.
The matrix $A \times B = (a_{i_1 \cdots i_r} b_{j_1 \cdots j_s})$ is called the open product of the matrices
$A = (a_{i_1 \cdots i_r})$, $B = (b_{j_1 \cdots j_s})$. A matrix A = (af, i not summed (j

† Some of the definitions given in this section are given in Composition and rank of n-way
matrices and multilinear forms, Annals of Mathematics, vol. 35 (1934), pp. 622-657. 


summed), is said to be the multiple-composite on the indices i, j of the matrices
(ai (aii, 

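A matrix $B = (b_{j_1 \cdots j_p})$ is said to be composed of a matrix $A = (a_{i_1 \cdots i_p})$,
$i_s = 1, \dots, m_s$, $s = 1, \dots, p$, bordered by zeros, if $b_{j_1 \cdots j_p} = a_{i_1 \cdots i_p}$ for
$j_s = i_s = 1, \dots, m_s$, $s = 1, \dots, p$ while all other elements of B vanish.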

Let a set of p-way matrices of order n be given by $B_1, \dots, B_m$. The
characteristic matrix of $B_1, \dots, B_m$ is the matrix $W = (\rho_1 B_1 + \cdots + \rho_m B_m)$ where
$\rho_1, \dots, \rho_m$ are parameters.

3. Similar transformations. Non-singular linear transformations on the
sets of variables $x^{(1)}_{i_1}, \dots, x^{(p)}_{i_p}$ of

$F = \sum_{i=1}^{n} x^{(1)}_{i} \cdots x^{(p)}_{i},$

which leave F invariant form a similar transformation on F. Two bilinear
forms and their associated matrices which are equivalent under a similar
transformation are similar in the sense of Dickson.‡ Let matrices $C_1, \dots, C_p$
of order n with elements in a field $\Phi$ be given by $(c^{(1)}_{\alpha i_1}), \dots, (c^{(p)}_{\alpha i_p})$
respectively, where these matrices are associated with the transformations


(1) (1) (1) 
(11) Xj, = » 

(2) (2) (2) 
(12) Vig = 


(p) (Pp) 


Assume that $p \ge 3$, and that $(1_1), (1_2), \dots, (1_p)$ leave F invariant, whence

(2) $(c^{(1)}_{\alpha i_1} \cdots c^{(p)}_{\alpha i_p}) = \delta,$


where 6 is defined on page 422. The layer 6, of 6 determined by setting 7; =1 
can be written as 6,=I'C,, where 


(1) (2) (p—2) (p-1) (1) (p—2) (p—1) (1) (2) (p—2) (p-1) 
*** eu Cu Cuda ** * Cn Cni€ni *** Cnt Cn 
(1) (@) (p-2) (p—1) 

*** on 


@) (p—2) (p—1) 
*** Cu Cin 
ql) @) (p—2) (p-1) 
*** Cu 
_@) (p-2) (p—1) 
*** C2 Ci 


@) (p—2) (p—1) @) (p—2) 
Cin * * * Cin In . Cni€nn*** Cnn Cnn 


‡ L. E. Dickson, Modern Algebraic Theories, p. 104.


T is the partition $i_2 i_3 \cdots i_{p-1}$. The second-order minors of the ξ, κ columns
of Γ are of the form


a) 
Mex = Cx1 


(2) (p-1) (2) (p—1) 

=] (2) (p-1) (2) (p-1) 

Ctp “ee Cte Cxp Cxe 


K,, is a minor of the display 6=(@.7r) of the matrix M= 
a not summed. Let a matrix M’ be given by (C2, --- C22.) where 
+++, (C252._,) are the reciprocals of (cf), , respectively. Let 


y= rs) be the 2-way display of M’ obtained 7 letting the partition T be 
the index of the rows of y and B =ava; - - - a»_1 the index of the columns of y. 
Evidently 


Hence the rank of @ is n. 

Since C, is non-singular and the i,-rank of 6, is 1, the display T is of rank 
1. Hence for &, x given, the minors M,, of ' vanish. Since @ is of rank m as 
displayed, the minors K;, do not all vanish for given values of &, x, whence 
the products vanish for all values of x. Take Then =0 
for all x1. Similarly take c{! for i=2, ,n. Then =0 fora¥t,. This 
determines a matrix which satisfies (2). All matrices (co? ) which satisfy 
(2) are obtained from C; by arbitrary reordering of the rows and columns of Ci. 

Since - - - =1 when i;=i,= - - - =i, it follows that if C, is a di- 


(3) (p) 
agonal matrix, Con, Co, #0, and 


(p) 1 


Cc => —- 
aa 


Further since ci} when (i,---, ip)¥(a,---,a), taking 
ij= +++ we get =0 when a#+i,. Similarly, co =0 when 
=0 when ati... 


Evidently all of the solutions of (2) can be obtained from the diagonal
matrices $C_1, \dots, C_p$ as determined above by simultaneous interchanges* of
the rows and simultaneous interchanges of the columns of $C_1, \dots, C_p$. We
have proved


THEOREM 1. If $C_1, \dots, C_p$ are the p matrices, $p \ge 3$, of order n associated
with a similar transformation in a field $\Phi$, then $C_1, \dots, C_p$ are diagonal
matrices satisfying the condition $C_p = C_1^{-1} C_2^{-1} \cdots C_{p-1}^{-1}$, or matrices obtained from these
diagonal matrices by simultaneous interchanges of the rows and simultaneous
interchanges of the columns of these matrices.
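
The diagonal case of Theorem 1 can be checked numerically for p = 3. The following is a minimal sketch and not part of the original paper. It assumes that relation (2) reads "the sum over α of $c^{(1)}_{\alpha i} c^{(2)}_{\alpha j} c^{(3)}_{\alpha k}$ equals $\delta_{ijk}$" and that the condition in the theorem means the corresponding diagonal elements of the three matrices multiply to 1. The helper names `delta3`, `factor_product` and `diagonal` are hypothetical.

```python
from fractions import Fraction
from itertools import product


def delta3(n):
    """Entries of the 3-way delta-matrix of order n, keyed by (i, j, k)."""
    return {(i, j, k): (1 if i == j == k else 0)
            for i, j, k in product(range(n), repeat=3)}


def factor_product(c1, c2, c3):
    """Entries sum_a c1[a][i] * c2[a][j] * c3[a][k] of the factored 3-way matrix."""
    n = len(c1)
    return {(i, j, k): sum(c1[a][i] * c2[a][j] * c3[a][k] for a in range(n))
            for i, j, k in product(range(n), repeat=3)}


def diagonal(values):
    """Diagonal 2-way matrix with the given diagonal elements."""
    n = len(values)
    return [[values[r] if r == c else Fraction(0) for c in range(n)]
            for r in range(n)]


if __name__ == "__main__":
    d1 = [Fraction(2), Fraction(3)]
    d2 = [Fraction(5), Fraction(1, 7)]
    # Third factor chosen so that d1[a] * d2[a] * d3[a] == 1 for every a.
    d3 = [1 / (x * y) for x, y in zip(d1, d2)]
    print(factor_product(diagonal(d1), diagonal(d2), diagonal(d3)) == delta3(2))
```

Exact rational arithmetic (`fractions.Fraction`) is used so that the comparison with the δ-matrix is not affected by rounding.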


In the case p=2, as is well known, the matrices $C_1$, $C_2$ associated with a
similar transformation satisfy the property $C_2 = C_1'^{-1}$, $C_1'$ being the transpose
of $C_1$, but are not necessarily diagonal. We have here a case where the theory
for p-way matrices, $p \ge 3$, is much simpler than that for 2-way matrices.

By Theorem 1 the canonical† pairs of p-way matrices, where one of the
matrices is a δ-matrix, can be written down.

An effect of a similar transformation on a given matrix is stated in 


THEOREM 2. Under a similar transformation in a field $\Phi$ on the indices
$i_1, \dots, i_p$ the $(i_1, \dots, i_p)$ diagonal elements of a p-way matrix $A = (a_{i_1 \cdots i_p})$,
$p \ge 3$, are at most rearranged.


‘ 1 1 1) 
Evidently, = da. --- where a has some value 


between 1 and n. α in the above relation is assumed not summed.
4. Preliminary theorems. We shall prove 


THEOREM 3. If a p-way matrix $A = (a_{i_1 \cdots i_p})$, $p \ge 3$, with elements in a field
$\Phi$, is equivalent under non-singular linear transformations in $\Phi$ to a δ-matrix on
$(i_1, \dots, i_p)$, the characteristic matrix $M = (\rho_1 a_{1 i_2 \cdots i_p} + \cdots + \rho_n a_{n i_2 \cdots i_p})$ can be
chosen non-singular for at least one set of values of $\rho_1, \dots, \rho_n$ in $\Phi$.


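Let $\delta = (\delta_{j_1 \cdots j_p})$ represent a δ-matrix on $(j_1, \dots, j_p)$ and let $\delta_1 = (\delta_{1 j_2 \cdots j_p})$,
$\delta_2 = (\delta_{2 j_2 \cdots j_p})$, $\dots$, $\delta_n = (\delta_{n j_2 \cdots j_p})$. Let the characteristic matrix
$(\rho_1 \delta_1 + \cdots + \rho_n \delta_n)$ of $\delta_1, \dots, \delta_n$ be given by $W = (w_{j_2 \cdots j_p})$. The $i_1$-layers $U_1, \dots, U_n$
of the matrix $L = (c^{(1)}_{j_1 i_1} \delta_{j_1 \cdots j_p})$ obtained after the non-singular linear
transformation $(1_1)$ on the variables $x^{(1)}$ of F
are related to $\delta_1, \dots, \delta_n$ by the relations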


* By a simultaneous interchange of the i and 7 rows of Ci, - - - , Cp is meant the interchange of 
the i and j rows of Ci, - - + , the interchange of the i and j rows of Cp. 

t “Canonical” is used in the same sense as in the ordinary theory of bilinear forms and 2-way 
matrices. See Dickson, op. cit., p. 89 ff. 


$U_1 = c^{(1)}_{11} \delta_1 + \cdots + c^{(1)}_{n1} \delta_n,$
$\cdots$
$U_n = c^{(1)}_{1n} \delta_1 + \cdots + c^{(1)}_{nn} \delta_n.$
The characteristic matrix W' of $U_1, \dots, U_n$ is given by $W' = (\sigma_1 U_1 + \cdots + \sigma_n U_n)$.
W' can be obtained from W by the non-singular linear transformation
on the ρ's given by

(3)   $\rho_1 = c^{(1)}_{11} \sigma_1 + \cdots + c^{(1)}_{1n} \sigma_n,$
      $\cdots$
      $\rho_n = c^{(1)}_{n1} \sigma_1 + \cdots + c^{(1)}_{nn} \sigma_n.$

If $\rho_1 = \cdots = \rho_n = 1$ then W is the non-singular δ-matrix on $(j_2, \dots, j_p)$. By
equations (3), $\sigma_1, \dots, \sigma_n$ can be so chosen that W' is non-singular.
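
The relation between W and W' can be traced in a small numerical sketch for p = 3, where each layer is an ordinary n-by-n matrix. This is an illustration only, not the paper's own computation; the names `delta_layer`, `linear_combination`, `transformed_layers` and `rho_from_sigma` are hypothetical, and the index convention ($U_i$ is the sum over j of $c_{ji}\delta_j$) is the one assumed in the relations above.

```python
def delta_layer(j, n):
    """The j-th layer of the 3-way delta-matrix: a single 1 at position (j, j)."""
    return [[1 if r == c == j else 0 for c in range(n)] for r in range(n)]


def linear_combination(coeffs, matrices):
    """Entrywise sum of coeffs[k] * matrices[k]."""
    n = len(matrices[0])
    return [[sum(k * m[r][c] for k, m in zip(coeffs, matrices))
             for c in range(n)] for r in range(n)]


def transformed_layers(c):
    """Layers U_i = sum_j c[j][i] * delta_j after transforming the first index."""
    n = len(c)
    deltas = [delta_layer(j, n) for j in range(n)]
    return [linear_combination([c[j][i] for j in range(n)], deltas)
            for i in range(n)]


def rho_from_sigma(c, sigma):
    """Equations (3): rho_j = sum_i c[j][i] * sigma_i."""
    n = len(sigma)
    return [sum(c[j][i] * sigma[i] for i in range(n)) for j in range(n)]


if __name__ == "__main__":
    c = [[1, 2], [3, 4]]
    sigma = [5, 7]
    w_prime = linear_combination(sigma, transformed_layers(c))
    rho = rho_from_sigma(c, sigma)
    w = linear_combination(rho, [delta_layer(j, 2) for j in range(2)])
    print(w_prime == w)
```

Since W is diagonal with diagonal elements $\rho_1, \dots, \rho_n$, choosing the σ's so that no $\rho_j$ vanishes makes W' non-singular, which is the step used in the proof.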
Non-singular linear transformations (12), --- , (1,) on as- 
sociated with LZ correspond to the transition ‘Jone W’=(W.... ip) to W”’ 
where are non- matrices, then 
(2) (p) .(p) 
(55,6 C5 ), 


which is obviously non-singular. This completes the proof of the theorem. 

Theorem 3 gives a necessary condition for the non-singularity of a given 
matrix. If p=3 it is very easy to test a given matrix by means of the theorem. 
For p>3 no simple general method has been found for applying it. 

Let $T = (T_1, \dots, T_m)$ denote a set of 2-way matrices of order n with
elements in a field $\Phi$. If the set T is to be equivalent to a set of diagonal matrices
under similar transformation* in $\Phi$ it is necessary that $T_1$ be equivalent under
similar transformation in $\Phi$ to a matrix of the form

$D = \begin{pmatrix} \tau_1 I_1 & & \\ & \ddots & \\ & & \tau_\mu I_\mu \end{pmatrix},$

where $\tau_1, \dots, \tau_\mu$ are mutually distinct scalars, and $I_1, \dots, I_\mu$ are Kronecker
deltas. The condition for such equivalence is given in the

LEMMA. A 2-way matrix $T_1$ is equivalent under a similar transformation
in $\Phi$ to a matrix of type D if and only if the invariant factors of $(T_1 - \lambda I)$, where
I is a Kronecker delta, split up into distinct linear factors in $\Phi$.


* For a treatment of the equivalence of 2-way matrices under similar transformations see L. E. 
Dickson, op. cit., p. 89 ff. 
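Let the set T now be reduced to $D, T_2', \dots, T_m'$ under similar
transformation in $\Phi$, and let

$T_s' = \begin{pmatrix} T^s_{11} & \cdots & T^s_{1\mu} \\ \vdots & & \vdots \\ T^s_{\mu 1} & \cdots & T^s_{\mu\mu} \end{pmatrix},$

where $T^s_{ii}$ is of the same order as $I_i$ for $i = 1, \dots, \mu$. The most general matrix
X which satisfies the relation $X D X^{-1} = D$ is of the form

$X = \begin{pmatrix} X_{11} & & 0 \\ & \ddots & \\ 0 & & X_{\mu\mu} \end{pmatrix},$

where $X_{ii}$ is of the same order as $I_i$ for $i = 1, \dots, \mu$. Under similar
transformation with X, $X^{-1}$ the matrix $T_s'$ goes into

$T_s'' = \begin{pmatrix} X_{11} T^s_{11} X_{11}^{-1} & \cdots & X_{11} T^s_{1\mu} X_{\mu\mu}^{-1} \\ \vdots & & \vdots \\ X_{\mu\mu} T^s_{\mu 1} X_{11}^{-1} & \cdots & X_{\mu\mu} T^s_{\mu\mu} X_{\mu\mu}^{-1} \end{pmatrix}$

for $s = 2, \dots, m$. If $T_s''$ are diagonal matrices, $X_{\alpha\alpha} T^s_{\alpha\beta} X_{\beta\beta}^{-1} = 0$ for all s and
$\alpha \neq \beta$, whence $T^s_{\alpha\beta} = 0$ for all s and $\alpha \neq \beta$. Further, for every σ, the matrices
$X_{\sigma\sigma} T^2_{\sigma\sigma} X_{\sigma\sigma}^{-1}, \dots, X_{\sigma\sigma} T^m_{\sigma\sigma} X_{\sigma\sigma}^{-1}$ must be diagonal matrices. We have proved


THEOREM 4. Let $T_1$ of the set $T = (T_1, \dots, T_m)$ of 2-way matrices of order
n satisfy the Lemma. The set T is equivalent under similar transformations in $\Phi$
to a set of diagonal matrices if and only if $T^s_{\alpha\beta} = 0$ for $\alpha \neq \beta$; $\alpha, \beta = 1, \dots, \mu$;
$s = 2, \dots, m$, and the set $\Sigma_\sigma = (T^2_{\sigma\sigma}, \dots, T^m_{\sigma\sigma})$ is equivalent under similar
transformation in $\Phi$ to a set of diagonal matrices for every σ for which the matrices
in $\Sigma_\sigma$ are of order greater than 1.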
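
The first condition of Theorem 4, that every off-diagonal block vanishes, is easy to test once the orders of the blocks $I_1, \dots, I_\mu$ are known. The sketch below is only an illustration of that one condition; the function name `off_diagonal_blocks_vanish` and the representation of matrices as nested lists are assumptions, and the second condition (simultaneous diagonalizability of the diagonal blocks) is not checked here.

```python
def off_diagonal_blocks_vanish(matrix, block_orders):
    """True when every entry outside the diagonal blocks is zero.

    `block_orders` lists the orders of the diagonal blocks, i.e. the
    orders of the Kronecker deltas I_1, ..., I_mu appearing in D.
    """
    block_of = []
    for block, order in enumerate(block_orders):
        block_of.extend([block] * order)
    if len(block_of) != len(matrix):
        raise ValueError("block orders must add up to the order of the matrix")
    size = len(matrix)
    return all(matrix[r][c] == 0
               for r in range(size)
               for c in range(size)
               if block_of[r] != block_of[c])


if __name__ == "__main__":
    t2 = [[1, 2, 0],
          [3, 4, 0],
          [0, 0, 5]]
    # Blocks of orders 2 and 1: the off-diagonal blocks of t2 are zero.
    print(off_diagonal_blocks_vanish(t2, [2, 1]))
```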


The analogue of Theorem 4 for p-way matrices, $p \ge 3$, is given in


THEOREM 5. If $p \ge 3$, and the p-way matrices $T_1, \dots, T_m$ of order n with
elements in a given field $\Phi$ are equivalent under similar transformation in $\Phi$ to a set
of diagonal matrices, the matrices $T_1, \dots, T_m$ are diagonal matrices.


By Theorem 2 the $(i_1, \dots, i_p)$ diagonal elements of a matrix $T_s = (t^s_{i_1 \cdots i_p})$
are at most rearranged under similar transformation on $i_1, \dots, i_p$.

Theorem 5 can be used to test the equivalence of a set $T = (T_1, \dots, T_m)$
of p-way matrices, $p \ge 3$, of order n, where $T_1$ is non-singular, to a set of
diagonal matrices. Under reduction of $T_1$ to δ, where δ is a p-way δ-matrix
of order n, the matrices $T_2, \dots, T_m$ go into a set $\Sigma = (T_2', \dots, T_m')$. It is
evident that the set T is equivalent to a set of diagonal matrices if and only if
the set Σ is equivalent to a set of diagonal matrices under similar
transformation.

5. Necessary and sufficient conditions for a matrix to be non-singular. 
We have the following factorization property of non-singular matrices. 


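THEOREM 6. A p-way matrix $A = (a_{i_1 \cdots i_p})$, $p \ge 3$, is non-singular if and
only if A can be factored into the form $(c^{(1)}_{\alpha i_1} \cdots c^{(p)}_{\alpha i_p})$, where $(c^{(1)}_{\alpha i_1}), \dots, (c^{(p)}_{\alpha i_p})$
are non-singular.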


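The δ-matrix $(\delta_{j_1 \cdots j_p})$ on $(j_1, \dots, j_p)$ can be written as $(\delta_{\alpha j_1} \cdots \delta_{\alpha j_p})$,
where $(\delta_{\alpha j_1}), \dots, (\delta_{\alpha j_p})$ are Kronecker deltas. If A is non-singular, we have

$(a_{i_1 \cdots i_p}) = (\delta_{j_1 \cdots j_p}\, c^{(1)}_{j_1 i_1} \cdots c^{(p)}_{j_p i_p}) = (c^{(1)}_{\alpha i_1} \cdots c^{(p)}_{\alpha i_p}),$

where $(c^{(1)}_{\alpha i_1}), \dots, (c^{(p)}_{\alpha i_p})$ are non-singular 2-way matrices.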
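Every matrix which can be written in the form

(4) $(c^{(1)}_{\alpha i_1} \cdots c^{(p)}_{\alpha i_p})$   $(\alpha = 1, \dots, n)$,

where the rank of $(c^{(1)}_{\alpha i_1})$ is m and the ranks of $(c^{(2)}_{\alpha i_2}), \dots, (c^{(p)}_{\alpha i_p})$ are all equal
to n, can (regardless of the ranges of $i_1, \dots, i_p$) be reduced under elementary
transformations* to

$N = (c^{(1)}_{\alpha i_1} \cdots c^{(p)}_{\alpha i_p})$   $(i_1 = 1, \dots, m;\ i_2, \dots, i_p = 1, \dots, n)$,

or N bordered by zeros. Our theorems, which will be stated for N instead of
(4), will therefore hold for more general cases.

Let E denote the matrix $(e_{i_1 \cdots i_p})$, $i_1 = 1, \dots, m$; $i_2, \dots, i_p = 1, \dots, n$.

We shall now prove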


THEOREM 7. The matrix E with elements in a field $\Phi$ is factorable into a
matrix of type N with elements in $\Phi$ if and only if E is non-singular on $i_1$ and
the $i_1$-layers of E are equivalent in $\Phi$ to a set of diagonal matrices.


The $i_1$-layers of E must be linearly independent since the $i_1$-rank of N is m
and this rank is invariant under non-singular linear transformations.
Let (C22,),---, (C{2,,) be the reciprocals of (cB) respec- 


tively. If E=N, pe 


* For a discussion of elementary transformations see Bôcher, Introduction to Higher Algebra,
p. 55. Elementary transformations leave the factorization property (4) invariant. 
where $(\delta_{\alpha\alpha_2}), \dots, (\delta_{\alpha\alpha_p})$ are Kronecker deltas of order n. Equation (5) is
equivalent to the set 


(2) (p) (1) 


The matrices 52a; Saa,), (CO Saa,) are diagonal matrices. 
We have proved that if E is factorable into a matrix N, then the $i_1$-layers
of E are linearly independent, and equivalent under non-singular linear
transformations in $\Phi$ to a set of diagonal matrices; the converse is simply proved.

We have now determined enough conditions to test the non-singularity
of a given matrix $A = (a_{i_1 \cdots i_p})$, $i_1, \dots, i_p = 1, \dots, n$. The procedure of this
test is as follows. Determine whether or not the necessary condition of
Theorem 7 concerning the non-singularity of A on $i_1$ is satisfied. If A is
non-singular on $i_1$, determine (if possible) whether or not the necessary condition of
Theorem 3 is satisfied. If so, choose the matrix M mentioned in Theorem 3
so that it is non-singular. Let the ρ's and the $i_1$-layers of A be ordered so
that $\rho_1 \neq 0$. Let $A_1, \dots, A_n$ designate the $i_1$-layers of A. Determine by
Theorems 4 and 5 whether or not the set $M, A_2, \dots, A_n$ is equivalent to a set of
diagonal matrices. If not, A is singular.* If the contrary is the case, and $p = 3$
(similarly for $p \ge 4$), then there exist non-singular matrices X, Y such that

$X \left( \sum_{i=1}^{n} \rho_i A_i \right) Y = D_1, \quad X A_2 Y = D_2, \quad \dots, \quad X A_n Y = D_n,$

where $D_1, \dots, D_n$ are diagonal matrices. Multiplying the last $(n-1)$
equations by $\rho_2, \dots, \rho_n$ respectively and subtracting the resulting equations from
the first, we obtain $X \rho_1 A_1 Y = D_1 - \sum_{i=2}^{n} \rho_i D_i$. The matrix on the right is a
diagonal matrix, from which it follows that the set $A_1, \dots, A_n$ is equivalent to
a set of diagonal matrices. Theorem 7 is now satisfied, and A is non-singular.
In certain situations the non-singularity of A is at once evident from
Theorem 6.
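
The first step of the test, deciding whether A is non-singular on $i_1$, amounts to checking that the $i_1$-layers are linearly independent. A minimal sketch for a 3-way matrix stored as nested lists `a[i1][i2][i3]` follows. It is not taken from the paper; the names `rank`, `layer_rank` and `is_nonsingular_on_first_index` are hypothetical, and exact rational arithmetic is assumed so that the rank is not affected by rounding.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a list of equal-length rows, by exact Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    cols = len(m[0]) if m else 0
    for col in range(cols):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][col] != 0:
                factor = m[i][col] / m[r][col]
                m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r


def layer_rank(a):
    """i1-rank of a[i1][i2][i3]: the number of linearly independent i1-layers."""
    flattened = [[x for row in layer for x in row] for layer in a]
    return rank(flattened)


def is_nonsingular_on_first_index(a):
    """True when all i1-layers of a are linearly independent."""
    return layer_rank(a) == len(a)


if __name__ == "__main__":
    delta = [[[1, 0], [0, 0]], [[0, 0], [0, 1]]]
    print(is_nonsingular_on_first_index(delta))
```

Each layer is flattened into a single row, so the T-rank defined in §2 becomes the ordinary rank of a 2-way display of A.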

Let $A_1, \dots, A_n$ be the $i_1$-layers of $A = (a_{i_1 \cdots i_p})$ with elements in a field
$\Phi$ and $p \ge 4$, and let $\rho_1, \dots, \rho_n$ be chosen so that the characteristic matrix
$M = (\sum \rho_i A_i)$ is non-singular for $\rho_1 \neq 0$. Let the transformations which
reduce M to a δ-matrix reduce $A_2, \dots, A_n$ to $A_2', \dots, A_n'$. Theorems 5 and
7 imply


THEOREM 8. The matrix $A = (a_{i_1 \cdots i_p})$, $p \ge 4$, is non-singular if and only
if A is non-singular on $i_1$ and the matrices $A_2', \dots, A_n'$ are diagonal matrices.


* A matrix not non-singular is said to be singular. 
If $A = (a_{i_1 i_2 i_3})$ is a 3-way non-singular matrix of order n and $A_1, \dots, A_n$
are the $i_1$-layers of A, by Theorem 7 there exist non-singular matrices X, Y
such that

$X A_i Y = Q_i$   $(i = 1, \dots, n)$,

where $Q_i$ are diagonal matrices. The matrices X, Y can be obtained from
Theorem 4 and the theory of bilinear forms. Arrange the matrices $Q_i$ to form
the rows of a 2-way matrix P. Since the $i_1$-rank of A is n, there exists a
non-singular minor V of P. The matrix $V^{-1}P$ is a 2-way display of a matrix δ,
where δ is a 3-way δ-matrix of order n. The matrices $V'^{-1}$, X', Y (the primes
here denote transpose) are hence matrices which reduce A to δ under
transformation on $i_1$, $i_2$, $i_3$ respectively. The matrices of reduction from a p-way
non-singular matrix, $p \ge 4$, to a δ-matrix are obtained similarly.

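We define the factorization rank of a matrix $A = (a_{i_1 \cdots i_p})$ to be the
minimum value of t for which the matrix A can be written in the form
$(c^{(1)}_{\alpha i_1} \cdots c^{(p)}_{\alpha i_p})$, $\alpha = 1, \dots, t$, where $(c^{(1)}_{\alpha i_1}), \dots, (c^{(p)}_{\alpha i_p})$ are 2-way matrices. This rank
is invariant under non-singular linear transformations. The factorization rank
of a matrix is n if all of the matrices $(c^{(1)}_{\alpha i_1}), \dots, (c^{(p)}_{\alpha i_p})$ are
non-singular.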

In another paper* the author has defined certain ranks of a p-way matrix 
which are invariant under non-singular linear transformations. The following 
theorem is easily proved. 


THEOREM 9. A p-way matrix A of order n is non-singular if and only if all
of its invariant ranks are equal to n.


6. Note on invariant factors. The matrix $W = (\sum \rho_i B_i)$ used in §2
suggests the following generalization of ordinary invariant factor theory. Let
$B_i$, $i = 1, \dots, m$, be square matrices of order n. Let $G_t$ be the greatest
common divisor of the minors of W of the t-th order, and let $G_0 = 1$. We define the
t-th invariant factor of W to be the quotient $G_t/G_{t-1}$. It is determined up to a
constant factor. It is assumed in factoring the minors of W to obtain the $G_t$,
$t = 0, 1, \dots, n$, that the factorization is performed in a given field.† Now
when $B_1, \dots, B_m$ are multiplied by non-singular matrices the quotients
$G_t/G_{t-1}$ are invariant. If

$B_1' = a_{11} B_1 + \cdots + a_{1m} B_m,$
$\cdots$
$B_m' = a_{m1} B_1 + \cdots + a_{mm} B_m,$


* Composition and rank of n-way matrices and multilinear forms, Annals of Mathematics, vol. 35 
(1934), pp. 625, 633, 634. 

† It is to be noted that for $m \ge 3$ a minor of W cannot always be factored into distinct linear
factors in a field. To obtain linear factors it is necessary in general to use the quasi-field of quaternions
and other generalizations of fields.
and the matrix of coefficients is non-singular, then the invariant factors of
$W' = (\sigma_1 B_1' + \cdots + \sigma_m B_m')$ can be obtained from the invariant factors of W
by the transformations

$\rho_1 = a_{11} \sigma_1 + \cdots + a_{m1} \sigma_m,$
$\cdots$
$\rho_m = a_{1m} \sigma_1 + \cdots + a_{mm} \sigma_m.$

Hence powers of terms occurring in invariant factors go into like powers
under the transformation from the set $B_1, \dots, B_m$ to $B_1', \dots, B_m'$.

The determinant of W in the proof of Theorem 3 where p=3 is the only
invariant factor of W distinct from unity. This determinant factors into
distinct linear factors in any field $\Phi$. It follows that if Δ is the determinant of
the characteristic matrix of the i-layers of $R = (r_{ijk})$ and R is equivalent in $\Phi$
to a δ-matrix on (i, j, k), then Δ factors into distinct linear factors in $\Phi$, and
is the only invariant factor distinct from a constant.

An exact generalization of the theory of this section holds for p-way
matrices, $p \ge 3$, where the invariant factors are defined in terms of space
determinants.


CHAPTER II. FACTORIZATION OF p-WAY MATRICES INTO A PRODUCT
OF 2-WAY MATRICES ONE OF WHICH IS SINGULAR


7. Introduction. By Theorem 7, a matrix $A = (a_{i_1 \cdots i_p})$, $i_1 = 1, \dots, m$;
$i_2, \dots, i_p = 1, \dots, n$, can be factored into a matrix $A = (c^{(1)}_{\alpha i_1} \cdots c^{(p)}_{\alpha i_p})$,
$\alpha = 1, \dots, n$, where $(c^{(1)}_{\alpha i_1}), \dots, (c^{(p)}_{\alpha i_p})$ are non-singular on $i_1, \dots, i_p$
respectively, if and only if the $i_1$-layers $A_1, \dots, A_m$ of A are linearly
independent, and are simultaneously equivalent to a set of diagonal matrices. To
complete the treatment of factorizations of the above type where $(c^{(1)}_{\alpha i_1})$ is singular
we must obtain necessary and sufficient conditions for the equivalence of
$A_1, \dots, A_m$ to a set of diagonal matrices. We do not assume as in the
treatment of a non-singular matrix A that there exist values of $\rho_1, \dots, \rho_m$
not all zero such that $\rho_1 A_1 + \cdots + \rho_m A_m$ is non-singular.

Using an essentially different technique we derive first (Theorem 10)
the canonical diagonal 2-way matrices $C_1, \dots, C_q$ to which diagonal matrices
$E_1, \dots, E_q$ are equivalent under non-singular linear transformations, and
then obtain necessary and sufficient conditions (Theorems 12, 13) for the
equivalence of a set of 2-way matrices $C_1, \dots, C_i, F_{i+1}$ to a set of diagonal
matrices.

To test the equivalence of a set $S = (A_1, \dots, A_m)$ of 2-way matrices to
a set of diagonal matrices, reduce $A_1$ to a canonical diagonal matrix $C_1$. If
this is not possible (this is very easy to determine), it follows that the set S
is not equivalent to a set of diagonal matrices. If $A_1$ is equivalent to a matrix
$C_1$, reduce $A_1$ to $C_1$. The remaining matrices of S are simultaneously
transformed into a set $A_2', \dots, A_m'$ respectively. Apply Theorems 12 and 13 to
determine whether or not the pair $C_1, A_2'$ is equivalent to a pair of canonical
diagonal matrices $C_1, C_2$. If not, the set S is not equivalent to a set of
diagonal matrices. If on the other hand the pair $C_1, A_2'$ is equivalent to a pair
$C_1, C_2$, reduce $C_1, A_2'$ to $C_1, C_2$. The remaining matrices in the set S are then
transformed into a set $A_3'', \dots, A_m''$. Now apply Theorems 12 and 13 to
$C_1, C_2, A_3''$. Continue this process until one finally arrives at a set of canonical
diagonal matrices $C_1, \dots, C_m$ to which the set S is equivalent.

8. Canonical diagonal 2-way matrices. Adopting the notation used by J.
Williamson in a recent paper* we shall write the "diagonal block matrix"

$C = \begin{pmatrix} C_1 & & \\ & \ddots & \\ & & C_r \end{pmatrix},$

where $C_1, \dots, C_r$ are square minors of C, in the form $C = [C_1, \dots, C_r]$.

Let the letters I, ρ with superscripts and subscripts denote a Kronecker delta,
and a parameter respectively. We shall prove

THEOREM 10. Let $S = (E_1, \dots, E_q)$ denote a set of diagonal 2-way matrices
of order n with elements in a given field $\Phi$. The set S is equivalent in $\Phi$ to a set
$S' = (C_1, \dots, C_q)$, where
(71) C; = [7:, 0], 
(72) C2 = 
and 
(81) St = edi], 


(82) = 0]; 
the p?,-- +, p2 are all distinct, and the matrix S} is of the same order as I}; 
in general any pair in the set S’ is of the form 


(Tint) Sow), 


where 


* Simultaneous reduction of two matrices to triangle form, American Journal of Mathematics, 
vol. 57 (1935), p. 282. 


i i+1 i+1 i+1 i+1
(92) S2= [om my+15 Pmytmel 


i+1 


and where the p’s in each matrix Si, --- , Sica are distinct, and the orders of 

the matrices S¥, Sj are equal to the orders of , respectively. 
At the same time we shall prove 


THEOREM 11. The matrices X, Y which satisfy the relations 
=(,,---, XC¥ =C; 
are of the form 


(10). 
X5(i)-1,8 (i)—1 


Xo(i),0¢4) J 


0 


X 


where Xu, -- +, are of the same orders as Ij,---, respectively. 


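If $E_1$ is of rank r, $E_1$ is obviously equivalent to $C_1$, where $I_1$ is of rank r.
It is readily verified that if $X C_1 Y = C_1$, then

$X = \begin{pmatrix} X_{11} & X_{12} \\ 0 & X_{22} \end{pmatrix}, \qquad Y = \begin{pmatrix} X_{11}^{-1} & 0 \\ Y_{21} & Y_{22} \end{pmatrix},$

where $X_{11}$ is a minor of order r.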
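
The block form above can be confirmed on a small case. The sketch below is only an illustration under the assumption that $C_1 = [I_1, 0]$ with $I_1$ of order r = 1 and n = 2; the helper names `matmul` and `fixes_c1` and the sample entries are hypothetical.

```python
from fractions import Fraction


def matmul(a, b):
    """Product of two 2-way matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


C1 = [[1, 0], [0, 0]]  # canonical form [I_1, 0] with r = 1, n = 2


def fixes_c1(x, y):
    """True when X * C1 * Y == C1."""
    return matmul(matmul(x, C1), y) == C1


if __name__ == "__main__":
    # X is upper block triangular, and the leading block of Y is the
    # inverse of the leading block of X; the remaining blocks are free.
    x = [[Fraction(2), Fraction(5)], [Fraction(0), Fraction(3)]]
    y = [[Fraction(1, 2), Fraction(0)], [Fraction(7), Fraction(4)]]
    print(fixes_c1(x, y))
```

Replacing the zero below the leading block of X by a non-zero entry breaks the relation, which is why that block must vanish.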


Assume now that £,,---, E; have been reduced to canonical forms 
Ci,---,C;and that X, Y are as given in (10). Let £;,: be denoted by 


i+1 i+1 


[P; °°? » Paws), 


where the minors P;*, - - - , Pit! are of the same orders as J}, - - - , Jaci). Let 
Xivsciyy * » Veciy,sc—1 DE set equal to zero. Then 
i+1 —1
Po 


i+1_—1 
XEwY = [XuPi Xu, 
Choose the non-zero minors of X, Y so that 
are in the classical canonical forms* Sj,---, Sigy—1, and choose X.(i).s<i, 
SO that Y is a Kronecker delta bordered by zeros. 
The matrix E£,,, has now been brought into a canonical form of type Cis. 
It is readily verified that the matrices X, Y of (10) which satisfy the rela- 
tion 


are of the form 
, , 
Xiu X i,m 


0 


X mim 


yl 
X 


0 


, 


X 


0 


where m, ;,=2, and Xif, are of the same 
orders as [i*',---, Int Fespectively. Theorems 10 and 11 now 
follow by induction. 

9. Necessary and sufficient conditions for the equivalence of a set of
2-way matrices to a set of diagonal matrices. Two-way matrices $F_1, \dots, F_m$
are equivalent under non-singular linear transformations to a set of diagonal


* This reduction is accomplished by rearranging the diagonal elements of $E_{i+1}$. Hence the
reduction of $E_{i+1}$ can be accomplished in any field $\Phi$.


matrices if and only if $F_1, \dots, F_m$ are equivalent to a set of canonical
diagonal matrices $C_1, \dots, C_m$ as given in Theorem 10. We shall therefore prove
Theorems 12 and 13 below.

Write 


( 
An Atom 

F, = (u=i+i1,---,m), 


j 


where Afi, -- , are of the same orders as I#,--- , Tic) re- 
spectively. These last matrices are minors of the matrix C; of Theorem 10. 


THEOREM 12. Let $\Sigma = (C_1, \dots, C_i, F_{i+1}, \dots, F_m)$ be a set of 2-way matrices
of order n with elements in a field $\Phi$. If $A^u_{s(i),s(i)} = 0$ for $u = i+1, \dots, m$, the
set Σ is equivalent in $\Phi$ to a set of diagonal matrices if and only if the following
conditions are satisfied:

(a) $A^u_{\alpha\beta} = 0$; $\alpha \neq \beta$; $\alpha, \beta = 1, \dots, s(i)$, and $u = i+1, \dots, m$.

(b) $A^{i+1}_{\gamma\gamma}, \dots, A^m_{\gamma\gamma}$ are equivalent in $\Phi$ for every γ in the set $\gamma = 1, \dots,
s(i) - 1$ to a set of diagonal matrices under similar transformation.


Let X, Y be given as in (10). We shall denote the matrices F) = XF,Y by 


( 
Bo 
= (u=it+1,---,m), 


u 


where the minors By are of the same orders as Aj, for k, /=1,-- - , s(t). Now 
a=1,---,s(i)—1. If the matrices F/ are diagonal matrices it is necessary 
that Bey for a=1,---, s(t)—1, whence Ab (i) =9, 
a=1,---,s(t)—1. Substituting these results in the formulas for the remain- 
ing elements of F,, we find we must also have At; =0 for a#8; a, B=1,---, 
s(i) —1, and the set of matrices A‘*’, - - - , A", must be equivalent for every 7, 
where y =1, - - - , s(i)—1, to a set of diagonal matrices under similar trans- 
formation. Necessary and sufficient conditions for such equivalence are given 
in Theorem 4. We have proved the necessity of the conditions of Theorem 12. 
They are also evidently sufficient. 

If Aid}. is of rank r’ #0, the matrix F,,, can be reduced under trans- 
formations leaving Cy, - - - , C; invariant to a new matrix Fj; in which 


i+1 i+1


(11) A 6(i),8(i) = 0], 


where Ji¢41)_1 is a Kronecker delta of order r’. Write 


i+1 {1] [2] i+1 
= = ( 


where p=1,---, s(i)—1, and the y have r’ columns while the ss. 


have r’ rows. We can now state 


THEOREM 13. Let the minor $A^{i+1}_{s(i),s(i)}$ of $F_{i+1}$ be reduced as in (11). The set
of matrices $\Sigma' = (C_1, \dots, C_i, F_{i+1})$ is equivalent in a field $\Phi$ to a set of diagonal
matrices if and only if the following conditions are satisfied:

2 2 : 

(a) = 1,---,s(@)—1. 

(b) = iy ax~B; Qa, B=1, s(i)—1. 

(c) The invariant factors of the matrices a=1, 
s(t) —1, where the I, are Kronecker deltas, factor into distinct linear factors in . 

The non-singular matrices which transform
= -1) 0] into itself, so that X 2th are of the 


form 
<48(i),8(%) 0 Wo ’ s(i),8(t) Var ’ 


where W, is a minor of the same order as ‘1, _,. If the set 2’ is to be equiva- 
lent to a set of diagonal matrices, it must be equivalent to a set Ci, - - - , Ci4:, 
where Si.) =Aid)y. The matrix Si; is a minor of C,,: as in Theorem 10. 
The matrix F;,; must then be equivalent to a matrix C;,, under transforma- 
tions X, Y which leave the set =’’=(Cy, - - - , Ci, Aiiq) invariant. If such 
matrices X, Y exist, then X-', Y-' also leaves invariant. If XF =C.41, 
then Hence the set is equivalent to a set Ci, - - , Cis: if 
and only if there exist matrices X, Y leaving =’ invariant such that 


(12) = 


If there are to be matrices X, Y, Cis; such that (12) is satisfied, it is readily 
seen by equating matrices that the following conditions must be satisfied: 


Wi i41 Wu O 


| 
i+] Wu O i+1 Wu 
(14) = 0 0 = 0 


{1} 
A 
p


for $\rho = 1, \dots, s(i)-1$, where $Y'_\rho$ has $r'$ rows and $X'_\rho$ has $r'$ columns. By (13)
and (14) we have


2] [2] 
(15) = Api) = (0 = 1,---, — 1). 


Further 


—1, [1] 1, (1) 
Y; WtAgciy,1, = W 11A 


[1] 


(16) 


Substituting (16) in XCi4,Y, and using (15), we get 


(XuS1X ll + 
{1] [1] 


[1] [1] 
A 
i+1 


{1] {1] i+1 
Aj A1,8(i) 


-1 {1] {1] i+1 
i+1 i+1 
A g(i),s(i)-1 A 4(i),8(i) 


To have the above matrix equal to $F_{i+1}$ we must have


Ass 

Also Ait, whence 41) must each be 
equivalent under similar transformation to a diagonal matrix for every a 
where a=1, - - - , s(i)—1. The necessity of condition (c) of the theorem now 
follows from the lemma, §4. 

We have proved the necessity of the conditions of Theorem 13. The suffi- 
ciency of these conditions is evident. 

10. Necessary and sufficient conditions for the equivalence of a set of
p-way matrices, $p \ge 3$, to a set of diagonal matrices. We shall now state the
analogues of Theorems 10 and 11 for $p \ge 3$.

Let $C_1, \dots, C_s$ be p-way matrices, $p \ge 3$, of orders $m_1, \dots, m_s$ respectively.
Let a p-way matrix C of order $m_1 + \dots + m_s$ be constructed in a
p-space by placing the matrices $C_1, \dots, C_s$ in a non-overlapping fashion on
the principal diagonal of C so that the principal diagonals of $C_1, \dots, C_s$ form
the principal diagonal of C, and let the elements of C not in these "minors" be
zero. The matrix C will be denoted, as in the 2-way case, by $C = [C_1, \dots, C_s]$.
Let the quantities $\lambda$, $\delta$ with or without subscripts and superscripts denote a
p-way non-singular diagonal matrix and a p-way $\delta$-matrix respectively. We
can now state


THEOREM 14. Let $S = (E_1, \dots, E_q)$ be a set of p-way diagonal matrices,
$p \ge 3$, of order n with elements in a field $\Phi$. The set S is equivalent in $\Phi$ to a
set $S' = (C_1, \dots, C_q)$, where


Cy (61, 0], C2 Ti], 
and 
Si = 0], 0]; 


the matrix S, is of the same order as 5; in general any consecutive pair in the 
set S’ is of the form 


Ca = [A1, 0, Az, 0, , Agcay, O, da, OJ, 


= 0] (q =1,---, o(a)), 


a+l1 


= [Nor ’ 0] (r a(a)), 


a+1 


= 5 0}, 
0], 


and o(a) =2*-!—1, a=2, and the minors QT, RY, --- , Roca), Sa, Ta are 
of the same orders as the corresponding minors dj, 0, - - - , AGca), 9, 5a, 0 of Ca. 


Let $i, j, \dots, k, l$ be the indices of the matrices $C_1, \dots, C_q$. As we prove
Theorem 14 we shall also prove

THEOREM 15. Let A*=(aj,), C*=(ch,), denote 
matrices which, under non-singular linear transformations with these matrices on 
the indices i,j, - - - , k,l of the matrices in the set S’ =(Cy, - - - , Cq), leave the set 
S’ invariant or at most reorder the elements of the minors \4,m=1, - - - , 7(q), of 
C, independently for each m. The matrices A%, - - - , D* are of the form 


where 
Qa 

\ 
[ Ais


Ale 
An 
An 


Ag(q)41,1 


qd q 
Ag(q)+1,3 A o(q)+1,2 


Do(q)+1,1 


where D',=A*%, BY, --- Cty for y=1,---, o(g)+1; Dt,=AG BY ---CE 
for y=1, --- , 0(g); and corresponding minors in the sets (Aj, - , 

spectively; the remaining minors of A‘, - - - , D* are of the same orders as the cor- 
responding zero minors of C ,; also, every minor except the last on the diagonal of 
each matrix A%,--- , D* as written in (17) is a diagonal matrix. The matrices 
A4,---+,D* may also be of the types obtained from those given in (17) by simul- 
taneous interchanges of the rows and simultaneous interchanges of the columns of 
the minors in each of the sets (Ads, ---, Dds); 
a=1,---,o0(q);8=1, 2, these interchanges being made independently for each 
Set. 


If there are u; non-vanishing elements on the diagonal of Fy, it is evident 
at once that E, is equivalent to C, where 4; is of order x}. 

Denote C, by (cZ.....) for every a. If = 
then 


(18) ( =0 


a=1 


B=u!+1,---,n. By (18), QP =0 where 
1 1 1 1 1 1


++ 


1 1 
(Bin Cind in) Cut nut n) 


1 1 


1 
1 


1 


Q is a minor of the display ,7,€;A=j, - of the matrix 
J =B'X --- XC'XD". By a lemma proved in another paper* by the author, 
G is non-singular since B', C!,---, D' are non-singular. Hence Q is non- 
singular on its columns, and P=0. Similarly, 


1 1 


1 
be 


1 
Also 

1 

1 1 

a=1 


where 6 isa 6-matrix on (8, y,---,7, €) 
ing Theorem 1 we obtain 


0 


1 
1 1 
Quy ,uy | | 


BY 


* Composition and rank of n-way matrices and multilinear forms, Annals of Mathematics, vol. 35 
(1934), p. 625. 
or matrices obtained from $A^1, B^1, \dots, D^1$ above by simultaneous interchanges
of the first rows and columns of $A^1, \dots, D^1$.
Assume that $E_1, \dots, E_{a-1}$ have been reduced to $C_1, \dots, C_{a-1}$, and the
matrices
1 


a—1 a-—1 a—l 
= (bj, ),-::,D 


which satisfy the relations 


are of the form 
An 
Ain 
An 
a—1 


0 Axe 


a-l 
Ag(a—1)+1,1 
a-l 
A g(a—1)+1,3 Ag(a—1)+1,2 


a-—1 
Do (a—1)+1,1 
a—1 
Do (a-1)+1,3 Do (a—1)+1,2 


where these matrices satisfy the properties mentioned for $A^q$ in Theorem 15.
It is evident at once that $E_a$ can be reduced under transformations with
$A^{a-1}, \dots, D^{a-1}$ to a matrix of type $C_a$.

We shall now restrict the matrices - - - , so that 


a—1l_a-—1 


1 
| du 
D= | . 1 
0 
) 
| 
| 
‘ 
a—1 
Dis | 
| 
| Di | 
: 0 . | 
| 
| | 
| 
Let the orders of the minors $\lambda^a_1, \lambda^a_2, \dots, \lambda^a_{\sigma(a)}, \delta_a$ be denoted by $u_1, u_2, \dots,$
$u_{\sigma(a)+1}$, respectively; and the orders of the zero minors between these
+ +++ +25 (a). By (19), 


ax8 Oxy 
X=Za+l 


, n, and B=1,---, By (20) where 


a—l a—l 


n) 


a—1 


Since - - , are to be taken non-singular, the minors 


bse. 
Bo (a—1)+1,2 


a—l 
Dn, 


a—1 
Do (a-1)+1,2 ™ 


a-—l 


dn, + dan 


are non-singular. By the lemma of another paper mentioned above,* the 
display 2 = (w,) is non-singular, where 2= B31 X 41,2 and 
A=j--- hl; p=y--+ 765 7,°°°, 7, & &, b=Zatl1,---, m. The 


* Composition and rank of n-way matrices and multilinear forms, Annals of Mathematics, vol. 35 
(1934), p. 625. 


| a—l | 

axe 1,2 . 

= 
f 
—1 
J 
=" 
| a) 
matrix $\Gamma_1$ is a minor of $\Omega$ consisting of certain rows of $\Omega$ and is therefore
non-singular on its rows. It follows that $T_1 = 0$. Similarly


a a 


a—1 
are zero 
It follows from (19) that 


(21) ( ms 


By (21), where 


2at1, (a) +1) +141 


T3 = 
a—l 


Again, since $\Gamma_3$ is non-singular on its rows, $T_3 = 0$. Similarly,
( a—1 a—l “ 
a—1 
#1” 
=| —1 


@ @ 
ds ds a a 1 
Zatl, +1+ +Ug(a) +1) 


a—1 a—1 
a 


Further if 8B, , 


( Zatue(a)+itl 


(22) 

where $\delta$ is a $\delta$-matrix on $(\beta, \gamma, \dots, \varepsilon)$. Equations of type (22) have already
been treated.*

In view of the above considerations, the matrices $A^{a-1}, \dots, D^{a-1}$ which
satisfy (19) are of the form $A^q, \dots, D^q$ as obtained from (17) by writing
$q = a$, and satisfy the properties of Theorem 15 for these matrices.

Theorems 14 and 15 follow by induction.


* See Theorem 1. 


We shall now prove an analogue of Theorems 12 and 13. Let $C_{a1} = (c^{a1}_{\beta\gamma\cdots\varepsilon})$
be a matrix obtained from $C_a$ of Theorem 14 by replacing $\delta_a$ by a zero matrix.
We have


THEOREM 16. Let $\Sigma = (C_1, \dots, C_{a-1}, K_a)$ be a set of p-way matrices, $p \ge 3$,
of order n with elements in a field $\Phi$, where $C_1, \dots, C_{a-1}$ are canonical diagonal
matrices as given in Theorem 14. If the set $\Sigma$ is equivalent in $\Phi$ to a set of diagonal
matrices, the orders of $\lambda^a_1, \dots, \lambda^a_{\sigma(a)}, \delta_a$ and values of the non-vanishing
elements of $C_{a1}$ can be chosen so that $K_a - C_{a1}$ is a non-singular matrix of order r,
where r is the order of $\delta_a$, or such a matrix bordered by zeros; and conversely.


Let K.=(ks,....). If = is equivalent to a set of diagonal matrices, there 
exist matrices = (a%,"'), - - - , De-+=(dz~"), leaving Ci, - - - , invari- 
ant, and minors - - , 52 such that 

a a—l_a-—l a—l 
(Ray---e) = bj, die ); 
whence 
uf a a—l a—l ur a a-l a-—l 
A=1 


a 


a a 
A= +1 


A= 


where the u’s and v’s are as defined in the proof of Theorems 14 and 15. At 
once 


a 
“atuala)+1 


K.-Ca® ( > is’). 


A= 


If we wish to test the equivalence of a set $C_1, \dots, C_{a-1}, K_a, \dots, K_m$ to
diagonal matrices, we determine whether or not $C_1, \dots, C_{a-1}, K_a$ is equivalent
to a set $C_1, \dots, C_a$. If so, let $K_{a+1}, \dots, K_m$ go into $K'_{a+1}, \dots, K'_m$ under
transformation of the set $C_1, \dots, C_{a-1}, K_a$ to $C_1, \dots, C_a$. The matrices
$C_1, \dots, C_{a-1}, K_a, \dots, K_m$ are equivalent under non-singular linear transformations
to diagonal matrices if and only if $C_1, \dots, C_a, K'_{a+1}, \dots, K'_m$ are
equivalent to diagonal matrices. We reapply Theorem 16 to $C_1, \dots, C_a$,
$K'_{a+1}$. This process can be continued until we arrive at a canonical set
$C_1, \dots, C_m$.

In Theorem 14 we obtained canonical forms of p-way diagonal matrices,
$p \ge 3$, and in Theorem 16 necessary and sufficient conditions for the equivalence
of $C_1, \dots, C_{a-1}, K_a$ to canonical diagonal matrices. Since a set of matrices
is equivalent to a set of diagonal matrices if and only if it is equivalent
to a set of canonical diagonal matrices, we have derived necessary and sufficient
conditions for the equivalence of a set of p-way matrices, $p \ge 3$, to a
set of diagonal matrices.

Theorem 16 is in general difficult to apply. However, it is given here since 
no better equivalent theorem has been found. 


CHAPTER III. FACTORIZATION OF p-WAY MATRICES INTO 
3-WAY MATRICES 


11. Introduction. Necessary and sufficient conditions for a matrix 
A =(ai,...%,) to be of the form - - where (a{),---, are 
matrices non-singular on the index a, are given in chapter I. In the pres- 
ent chapter necessary and sufficient conditions are obtained for a matrix 
A to be of the form (af? - - - not summed, summed, 
where the 3-way matrices (a{}),---, (ai?) are non-singular on ij. The 
method of treatment covers the case where the index 7 in the matrices above 
does not occur* and the matrices (a{?),---, (aif) are non-singular on 7. 

In this chapter, as in the others, = may be partitions consisting of 
more than one index. 

12. Factorization into multiple composites. Let 6:,---, 5n-1 designate 
= Tespectively, where T =1,7;, T’ =i2j2; (5;,;,) is a Kronecker 
delta of order #, and (61:,:,), (&m-1,:,:.) are the i-layers of a 5-matrix 
on (i, z;, 72) of order m obtained by setting i=1,---, m—1 respectively. 
(8m—2,i,i,) are the i-layers of the 6-matrix (6;:,:,) on (é, i2) 
of order (m—1) on each index obtained by setting 7=1, - - - , m—2 respec- 
tively. 

Let $B_1, \dots, B_{m-1}$ be a set of 2-way matrices of order $n = mt$ with elements
in a field $\Phi$. If the matrices $B_1, \dots, B_{m-1}$ are equivalent under similar transformation
in $\Phi$ to $\delta_1, \dots, \delta_{m-1}$, it is evidently necessary that the matrices $B_i$
be each equivalent under similar transformation in $\Phi$ to $\delta_i$ for $i = 1, \dots, m-1$.
Now $B_1$ is equivalent under similar transformation to $\delta_1$ if and only if $(B_1 - \lambda I)$
has the same invariant factors† as $(\delta_1 - \lambda I)$, where I is a Kronecker delta.
Assume that $B_1$ is equivalent to $\delta_1$ as demanded. By reduction of $B_1$ to $\delta_1$

* The matrix A in this case is a generalization of the Scott product of two matrices. See M. Lecat, 
Abrégé de la Théorie des Déterminants a n Dimensions, 1911, Introduction, p. xl. 
† Dickson, Modern Algebraic Theories, p. 104.
under similar transformation, the matrices $B_2, \dots, B_{m-1}$ are transformed
into a set $B'_2, \dots, B'_{m-1}$. Let

$$B'_s = \begin{pmatrix} B'_{s11} & B'_{s12} \\ B'_{s21} & B'_{s22} \end{pmatrix} \qquad (s = 2, \dots, m-1),$$

where $B'_{s11}$ is a minor of order t.


THEOREM 17. The matrices $B_1, \dots, B_{m-1}$ of order n with elements in a field
$\Phi$ are equivalent under similar transformation in $\Phi$ to a set $\delta_1, \dots, \delta_{m-1}$ if and only
if

$$B'_{s11} = B'_{s12} = B'_{s21} = 0 \qquad (s = 2, \dots, m-1),$$

and the set $B'_{222}, B'_{322}, \dots, B'_{m-1,22}$ is equivalent under similar transformation
in $\Phi$ to the set of $(n-t)$-order matrices $\delta'_2, \delta'_3, \dots, \delta'_{m-1}$.
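The matrix W which satisfies the relation $W\delta_1W^{-1} = \delta_1$ is of the form*

$$W = \begin{pmatrix} W_{11} & 0 \\ 0 & W_{22} \end{pmatrix},$$

where $W_{11}$ is a square minor of W of order t. Now for $s = 2, \dots, m-1$,

$$WB'_sW^{-1} = \begin{pmatrix} W_{11}B'_{s11}W_{11}^{-1} & W_{11}B'_{s12}W_{22}^{-1} \\ W_{22}B'_{s21}W_{11}^{-1} & W_{22}B'_{s22}W_{22}^{-1} \end{pmatrix}.$$

Equating $WB'_sW^{-1}$ above to $\delta_s$, we obtain the conditions of Theorem
17. The matrices $\delta'_2, \delta'_3, \dots, \delta'_{m-1}$ form an array like $\delta_1, \delta_2, \dots, \delta_{m-2}$,
whence the above process may be reapplied to $B'_{222}, \dots, B'_{m-1,22}$ and
$\delta'_2, \delta'_3, \dots, \delta'_{m-1}$. Since m is finite this process is a terminating one.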

Let $\xi_i$, $i = 1, \dots, n-1$, now represent a diagonal matrix with the i-th element
on the diagonal as the only non-vanishing element. For square 2-way
matrices $B_1, \dots, B_{n-1}$ of order n to be equivalent under similar transformation
to the set $\xi_1, \dots, \xi_{n-1}$ it is in particular necessary that $(B_1 - \lambda I)$ have
the invariant factors $\lambda(\lambda - 1), \lambda, \dots, \lambda$. Assuming that this condition is satisfied,
let the set $B_1, \dots, B_{n-1}$ be reduced under similar transformation in $\Phi$
to a set $B'_1, \dots, B'_{n-1}$, where we write

$$B'_i = \begin{pmatrix} b'_{i11} & B'_{i12} \\ B'_{i21} & B'_{i22} \end{pmatrix},$$

where $b'_{i11}$ is a single element. Letting $n = mt$, $t = 1$ in Theorem 17, we have the


* Turnbull and Aitken, Introduction to the Theory of Canonical Matrices, 1932, p. 146. 
COROLLARY. The 2-way matrices $B_1, \dots, B_{n-1}$ of order n are equivalent under
similar transformation in $\Phi$ to $\xi_1, \dots, \xi_{n-1}$ if and only if $b'_{i11} = B'_{i12} = B'_{i21} = 0$,
$i = 2, 3, \dots, n-1$, and the set $B'_{222}, \dots, B'_{n-1,22}$ is equivalent under similar
transformation in $\Phi$ to the set $\xi'_2, \dots, \xi'_{n-1}$, where the $\xi'_i$
are $(n-1)$ by $(n-1)$ diagonal matrices which possess the $(i-1)$st diagonal element
as the only non-vanishing element, it being unity.
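
The invariant-factor condition stated above for $\xi_1$ can be checked by hand in a small case. The following is only an illustrative sketch for $n = 3$, using the standard determinantal-divisor computation; the symbols $D_k$ (monic greatest common divisor of the k-rowed minors) and $d_k$ (invariant factors) are introduced here for the illustration and are not the author's notation.

```latex
% Illustration only (n = 3): invariant factors of \xi_1 - \lambda I,
% where \xi_1 = diag(1, 0, 0).
\[
\xi_1 - \lambda I =
\begin{pmatrix} 1-\lambda & 0 & 0 \\ 0 & -\lambda & 0 \\ 0 & 0 & -\lambda \end{pmatrix}.
\]
% D_k = monic g.c.d. of all k-rowed minors; for a diagonal matrix the
% non-vanishing minors are products of diagonal elements.
\[
D_1 = \gcd(\lambda - 1,\ \lambda) = 1, \qquad
D_2 = \gcd\bigl(\lambda(\lambda - 1),\ \lambda^2\bigr) = \lambda, \qquad
D_3 = \lambda^2(\lambda - 1).
\]
% Invariant factors d_k = D_k / D_{k-1}.
\[
d_1 = 1, \qquad d_2 = \frac{D_2}{D_1} = \lambda, \qquad
d_3 = \frac{D_3}{D_2} = \lambda(\lambda - 1).
\]
```

Apart from the trivial factor 1, the invariant factors are $\lambda(\lambda - 1)$ and $\lambda$, in agreement with the list $\lambda(\lambda - 1), \lambda, \dots, \lambda$ required of $(B_1 - \lambda I)$.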


Let 61, 52,---, now denote the matrices (61:,-..:,) 
is a 6-matrix on (j1,---, jp) of order and (61,:,-.-:,), » 
are i-layers of a 6-matrix on (i, i, ---i,) of order m obtained by setting 
i=1,---,m-—1 respectively. Let Bi, --- , Bn: be p-way matrices, p=3, of 
order n, where n=mt, and where the matrix B,=(b;,...%,) is non-singular. 
Let the matrices C,=(cf;;,); s=1,---, p; Re=1,---, m, m; 
je=1, #, non-singular on k,, i,j, (i,j, is a single partition), reduce B, to 6 
under similar transformation, where 6 is a 6-matrix on (i:/1, - , ipjp)} 
let these matrices simultaneously reduce Bz,---, Bn. to BY,---, Bui. 
With these notations we can state


THEOREM 18. The set of p-way matrices B,,---, Bn, p23, of order n 
is equivalent in under similar transformation on ki,--- , kp with Cy= 


toa set if and only if B] =6, for s=2,+--, 
m—1 or Bi' =6, where Bs',---, Bn—1 are obtained from Bi,---, Buia by 
simultaneously rearranging the (isj:, i2j2,-++, tpjp) diagonal elements of 


It has been shown in Theorem 2 that under similar transformation on the
partitions $i_1j_1, i_2j_2, \dots, i_pj_p$, the $(i_1j_1, i_2j_2, \dots, i_pj_p)$ diagonal elements of a
matrix are at most rearranged, whence Theorem 18 follows.

Let &, - - - , now denote the i-layers of a 5-matrix on (i, ki, --- , kp) 
of order n. Let matrices B, = S=1, , be of order with 
elements in a field ¢. We have the 


COROLLARY. The set of p-way matrices $B_1, \dots, B_{n-1}$, $p \ge 3$, of order n is
equivalent under similar transformation in @ on i1,--+, tp with matrices 
(cit,) to if and only if for i=1, n—1 
or Bi! =£;,i=1,---,n—1, where Bi’, ---, are obtained by simultane- 
ously reordering the (i:,--- , tp) diagonal elements of Bi,--- , Bn-1. 


We now prove 


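THEOREM 19. Given a matrix $B = (b_{ik_1 \cdots k_p})$ with elements in a field $\Phi$ where
$i = 1, \dots, m$; $k_1, \dots, k_p = 1, \dots, n$; $n = mt$. Let $B_1, \dots, B_m$ be the i-layers
of B obtained by letting $i = 1, \dots, m$ respectively. Let $\delta$ be a $\delta$-matrix on
$(j_1, \dots, j_p)$ of order t, and $\delta''_1, \dots, \delta''_m$ be the i-layers of a $\delta$-matrix $\delta''$ on
$(i, i_1, \dots, i_p)$ of order m obtained by letting $i = 1, \dots, m$ respectively. The
matrix B is the multiple composite on i, j of matrices $A_s = (a^{(s)}_{ijk_s})$, $s = 1, \dots, p$,
non-singular on ij, $k_s$, with elements in $\Phi$ if and only if there exist matrices
$C_s = (c^{(s)}_{k_si_sj_s})$ non-singular on $k_s$, $i_sj_s$ with elements in $\Phi$ such that

$$B_1 \,|\, k_1 \cdots k_p \,|\, C_1 \times C_2 \times \cdots \times C_p = \delta''_1 \times \delta,$$
$$\cdots$$
$$B_m \,|\, k_1 \cdots k_p \,|\, C_1 \times C_2 \times \cdots \times C_p = \delta''_m \times \delta.$$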


If - - - af), i not summed, where A,=(aj;,);i=i,=1, +--+, m; 
j=j.=1, hi, k,=1, aes then 


If the matrix B=(b;x,...%,) =C, then by (23) 
B| ki---kp| +--+ X AP? 


where Az! =(A,x,;,/;,-) is the inverse of the matrix A, on i,j,, k, and (8;,;,:,"3,"), 
displayed as (57,7:), T,=i.j., T/ j/ , is a Kronecker delta. 
The author has shown* that 


where (4;,;,), (5;,;,,) are Kronecker deltas. By (24) and (25) we have 
(26) Ar? X +++ X Ap? = 8" XS. 


(24) 


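Equation (26) is equivalent to the set of equations

(27) $$B_1 \,|\, k_1 \cdots k_p \,|\, A_1^{-1} \times \cdots \times A_p^{-1} = \delta''_1 \times \delta, \quad \dots, \quad B_m \,|\, k_1 \cdots k_p \,|\, A_1^{-1} \times \cdots \times A_p^{-1} = \delta''_m \times \delta,$$

where $\delta''_1, \dots, \delta''_m$ are the i-layers of $\delta''$. This proves the theorem.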

To show that the factorization property of Theorem 19 can be recognized
by Theorems 17 and 18, we note that if the i-layers $B_1, \dots, B_m$ of B are equivalent
under non-singular linear transformations to $\delta''_1 \times \delta, \dots, \delta''_m \times \delta$,


* Composition and rank of n-way matrices and multilinear forms, Annals of Mathematics, vol. 35 
(1934), p. 629. 


| 
B,| ky - kp| Ar xX A;! = 6,’ xX 4, 
X = bn! X 6, 
as in (27), then $\bar{B} = B_1 + \dots + B_m$, $B_1, \dots, B_{m-1}$ are equivalent under
non-singular linear transformations to $\bar{\delta} = (\delta''_1 \times \delta + \dots + \delta''_m \times \delta)$, $\delta_1, \dots, \delta_{m-1}$,
where $\delta_1, \dots, \delta_{m-1}$ are defined as in Theorem 18, and conversely. Now $\bar{\delta}$ is a
$\delta$-matrix on the partitions $(i_1j_1, \dots, i_pj_p)$. If $\bar{B}$ is non-singular, reduce $\bar{B}$ to $\bar{\delta}$
under non-singular linear transformations. Simultaneously $B_1, \dots, B_{m-1}$
transform into matrices $B'_1, \dots, B'_{m-1}$. The matrices $\bar{B}, B_1, \dots, B_{m-1}$ are
equivalent to $\bar{\delta}, \delta_1, \dots, \delta_{m-1}$ if and only if $B'_1, \dots, B'_{m-1}$ are equivalent
under similar transformation to $\delta_1, \dots, \delta_{m-1}$. The conditions for such equivalence
for the 2-way case are given in Theorem 17 and for the p-way case, $p \ge 3$,
in Theorem 18.


COROLLARY. Let $\xi_1, \dots, \xi_n$ be the i-layers of a $\delta$-matrix on $(i, i_1, \dots, i_p)$
of order n, where these layers are obtained by letting $i = 1, \dots, n$ respectively. The
matrix $B = (b_{ik_1 \cdots k_p})$ of order n with elements in a field $\Phi$ can be written in
the form $(a^{(1)}_{ik_1} \cdots a^{(p)}_{ik_p})$, i not summed, where $(a^{(1)}_{ik_1}), \dots, (a^{(p)}_{ik_p})$ are non-singular
and possess elements in $\Phi$, if and only if the i-layers $B_1 = (b_{1k_1 \cdots k_p}), \dots,$
$B_n = (b_{nk_1 \cdots k_p})$ of B are equivalent in $\Phi$ to $\xi_1, \dots, \xi_n$.


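Consider the matrix $B = (b_{ik_1 \cdots k_p})$, $p \ge 3$, of order m on i, and order
$n = mt$ on $k_1, \dots, k_p$. Let $\delta_1, \dots, \delta_{m-1}$ be again defined as in Theorem 18. If
$\bar{B} = B_1 + \dots + B_m$ is non-singular, the matrices $E_s = (e^{(s)}_{i_sj_sk_s})$, non-singular on
$k_s$, $i_sj_s$, which reduce $\bar{B}$ to a $\delta$-matrix on $(i_1j_1, \dots, i_pj_p)$, $i_1, \dots, i_p = 1, \dots, m$;
$j_1, \dots, j_p = 1, \dots, t$, reduce the i-layers $B_1, \dots, B_{m-1}$ of B, obtained by
setting $i = 1, \dots, m-1$ respectively, to a set $B'_1, \dots, B'_{m-1}$. Theorems 2, 18,
and 19 imply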

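THEOREM 20. For $B = (b_{ik_1 \cdots k_p})$, $p \ge 3$, of order m on i and $n = mt$ on
$k_1, \dots, k_p$, to be the multiple composite on i, j of the matrices $A_s = (a^{(s)}_{ijk_s})$,
$s = 1, \dots, p$, non-singular on ij, $k_s$, it is necessary that the sum $\bar{B}$ of the i-layers
of B be non-singular. It is further necessary that

$$B'_1 = \delta_1, \quad \dots, \quad B'_{m-1} = \delta_{m-1},$$

or

$$B''_1 = \delta_1, \quad \dots, \quad B''_{m-1} = \delta_{m-1},$$

where $B''_1, \dots, B''_{m-1}$ are obtained from $B'_1, \dots, B'_{m-1}$ by simultaneous rearrangements
of the $(i_1j_1, \dots, i_pj_p)$ diagonal elements of $B'_1, \dots, B'_{m-1}$. These
conditions are also sufficient.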


Now let B represent the matrix $(b_{ik_1 \cdots k_p})$, $p \ge 3$, of order m. Let the i-layers
of B obtained by setting $i = 1, \dots, m$ be denoted by $B_1, \dots, B_m$ respectively.
If $\bar{B} = B_1 + \dots + B_m$ is non-singular, reduce $\bar{B}, B_1, \dots, B_{m-1}$ by
means of non-singular linear transformations to $\xi, B'_1, \dots, B'_{m-1}$, where $\xi$ is a
$\delta$-matrix on $(i_1, \dots, i_p)$ of order m. As in the corollary of Theorem 18 let
$\xi_1, \dots, \xi_{m-1}$ be the i-layers of a $\delta$-matrix on $(i, i_1, \dots, i_p)$ of order m. We have
the following


COROLLARY. For $B = (b_{ik_1 \cdots k_p})$, $p \ge 3$, of order n with elements in a
field $\Phi$ to be factorable into the form $(a^{(1)}_{ik_1} \cdots a^{(p)}_{ik_p})$, i not summed, where
$(a^{(1)}_{ik_1}), \dots, (a^{(p)}_{ik_p})$ are non-singular with elements in $\Phi$, it is necessary that
$\bar{B}$ be non-singular. It is further necessary that $B'_s = \xi_s$; $s = 1, \dots, n-1$,
or $B''_s = \xi_s$; $s = 1, \dots, n-1$, where $B''_1, \dots, B''_{n-1}$ are obtained from
$B'_1, \dots, B'_{n-1}$ by simultaneous rearrangements of the $(i_1, \dots, i_p)$ diagonal
elements of $B'_1, \dots, B'_{n-1}$. These conditions are also sufficient.


Theorems 17 to 20 may be extended at once to the multiple composite
on i, j of $A_1, \dots, A_p$, where $A_1, \dots, A_p$ are non-singular
on ij only.

The matrix $\delta'' \times \delta$ above is a canonical matrix of a class of $(p+1)$-way
matrices. We have hence determined necessary and sufficient conditions for
the equivalence of a $(p+1)$-way matrix to such a diagonal matrix under
non-singular linear transformations on all but one index.


APPENDIX 


For certain situations, the equivalence under similar transformation in a
given field of a set of 2-way matrices $S = (T_1, \dots, T_m)$ of order n to a set of
diagonal matrices can be recognized by the rank of a matrix H associated
with S. This appendix is devoted to the derivation of H.

Let 


be a classical canonical matrix} to which 7; is similar for i=1, - -- , m. 
Let a;, i=1,---,m, denote the matrix 


Ti —d}I 


where 7/,---, 7» are the transposes of 7i,---, 7, and dj,---, dy are 
permutations of cy, ---, for o=1,---, m. I is a Kronecker delta. As- 
sume that the matrices a;,i=1,- ~~, n, are all of rank n—1. Then there exist 


iinc;‘ isa superscript. 


= 
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values Pi, Wi2, Of Yi and a value of &; such that the minor 
Au,i=1,- - obtained from a; by deletingthe Path, Pith, - - 
rows and the £,th column is a non-singular minor of order n—1. Let Azi, 
i=1,---,m, be the column of a; with the Path, Pith, - - - 
elements deleted. Let Hu,---, Hin, Hu, ---, Hon be defined by the rela- 


where H;; is a column composed of the first £;—1 elements of the column 
—Aj,7'A2;, and H2; is a column composed of the remaining elements of 
—A,;'A2;. Let H now denote the matrix 


Ay Ai, 
(: ) 
Hey 


where the unit element in the ith column of H occurs in the é,th row for 
a=1,---,m. 

Evidently the set S is equivalent under similar transformation to a set
of diagonal matrices if and only if the set S is equivalent under similar
transformation to the set $S' = (C_1, \dots, C_m)$ or $S^* = (C^*_1, \dots, C^*_m)$, where
$C^*_1, \dots, C^*_m$ are matrices obtained from $C_1, \dots, C_m$ by arbitrary interchanges
of the diagonal elements. If the set S is to be equivalent under similar
transformation to the set $S'$, there exists a non-singular matrix $X = (x_{ij})$ such
that

$$XT_1X^{-1} = C_1, \quad \dots, \quad XT_mX^{-1} = C_m,$$

or what is the same thing,

(1) $$XT_1 = C_1X, \quad \dots, \quad XT_m = C_mX.$$

Equations (1) are equivalent to the set of equations


(21) 


-1 

|| ( 0, 

(2n) = 0. 
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Since a is of rank »—1, equation (2,) can be written as 


Ao1%1,¢,- 


-1 
= — 


Since $a_2, \dots, a_n$ are all of rank $n-1$, we obtain similar solutions from
$(2_2), \dots, (2_n)$ of the form


X21 


on§ -1 
= A A29%2,¢,, = — Ain A anXn, 


Xeon 
The matrix X can now be written as 


xX Xn,tn 


HonXn,t, 


Evidently 


X 


whence X can be taken non-singular if and only if H is non-singular. 
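
Equations (1) above say that each row of X is a common left characteristic vector of $T_1, \dots, T_m$. A minimal illustration follows, in which the $2 \times 2$ matrices $T_1$, $T_2 = T_1^2$ and the resulting X are chosen here purely as an example and are not taken from the paper.

```latex
% Illustration only: m = 2, n = 2, with T_2 = T_1^2 so that the two
% matrices have common characteristic vectors.
\[
T_1 = \begin{pmatrix} 2 & 1 \\ 0 & 3 \end{pmatrix}, \quad
T_2 = \begin{pmatrix} 4 & 5 \\ 0 & 9 \end{pmatrix}, \quad
C_1 = \begin{pmatrix} 2 & 0 \\ 0 & 3 \end{pmatrix}, \quad
C_2 = \begin{pmatrix} 4 & 0 \\ 0 & 9 \end{pmatrix}.
\]
% Rows of X: (1, -1) T_1 = 2 (1, -1) and (0, 1) T_1 = 3 (0, 1).
\[
X = \begin{pmatrix} 1 & -1 \\ 0 & 1 \end{pmatrix}, \qquad
XT_1 = \begin{pmatrix} 2 & -2 \\ 0 & 3 \end{pmatrix} = C_1X, \qquad
XT_2 = \begin{pmatrix} 4 & -4 \\ 0 & 9 \end{pmatrix} = C_2X.
\]
```

Here the determinant of X is 1, so X is non-singular and the pair $(T_1, T_2)$ is reduced to the diagonal pair $(C_1, C_2)$ under similar transformation.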


= 
Xin 
Now 
| 
( 
rn 
—) 
0 
m=H . 
0 
A like argument holds if the set $S'$ is replaced in the equations of (1) by $S^*$.
We have proved the 

THEOREM. The set S is equivalent under similar transformation to a set of 
diagonal matrices if and only if H is non-singular for at least one choice of the 
quantities 
THE EXISTENCE OF AN EXTREMUM
IN PROBLEMS OF MAYER* 
BY 
LAWRENCE M. GRAVES 
The problem of Mayer with variable end points as stated by Bliss† is that
of finding in a class of arcs

$$y_i = y_i(x) \qquad (a \le x \le b)$$

satisfying a system of differential equations and end conditions

$$\varphi_\beta(x, y, y') = 0, \qquad \psi_\mu[a, b, y(a), y(b)] = 0,$$

one which minimizes a function $g[a, b, y(a), y(b)]$. All simple integral problems
of the calculus of variations for which a fairly complete theory of the
relative extremum problem has been developed may be transformed to a
problem of the above type. The theorems we shall prove concerning the
existence of an absolute minimum may be translated directly into corresponding
existence theorems for the problem of Bolza or for the problem of
Lagrange. The problem of Mayer is considered here because it seems notationally
simpler than the problem of Bolza.

An existence theorem for a complicated problem such as we are considering
must naturally impose rather severe restrictions on the differential equations
involved. In order to treat as wide a variety of cases as possible, the
variable functions are divided into groups satisfying different types of conditions.
We shall divide them notationally into two groups. The functions $y_i(x)$,
defined for $a \le x \le b$, will be regarded as independent functions, and the curve
they determine in xy-space will be denoted by C. When the functions $y_i(x)$
are absolutely continuous, dependent functions $z_\rho(x)$ will be determined by
"differential equations" and initial conditions of the special form

(1) $$z'_\rho = h_\rho(x, y, y', z), \qquad z_\rho(a) = \zeta_\rho.$$

A solution of these equations, when existent, is a function of the curve C
and the initial values $\zeta$, and will be denoted by $z_\rho = z_\rho[x; C, \zeta]$. Other types
of equations and inequalities affecting the functions $y_i(x)$ only are introduced
in §4. The principal change these additional conditions make is to permit a
weakening of the hypotheses.


* Presented to the Society, April 15, 1933; received by the editors September 3, 1935. 

† The problem of Mayer with variable end points, these Transactions, vol. 19 (1918), pp. 305-314.

‡ Compare Graves, On the existence of the absolute minimum in problems of Lagrange, Bulletin
of the American Mathematical Society, vol. 39 (1933), pp. 101-104.
The final theorems state the existence of a minimum for a function
$I(C, \zeta) = g[a, b, y(a), y(b), \zeta, z(b)]$ in a quasi-closed class of admissible elements
$(C, \zeta)$. The end conditions do not enter the theorems explicitly. But
the nature of a quasi-closed class implies that in general the end conditions
$\psi_\mu = 0$ do not involve the end values $z(b)$.

The proof in §1 of the lower semi-continuity of the functionals $z[x; C, \zeta]$
defined by the equations (1) was suggested to me by McShane. This suggestion
makes possible the elimination of a Lipschitz condition, and other simplifications.
My original proof was more direct, but depended on a type of Lipschitz
condition, and a modification of the method of successive approximations.
The remainder of the paper is merely the result of fitting the various
hypotheses together so as to make the machine work while covering as many
cases as possible. A fundamental case is taken up first in §§2 and 3. Various
generalizations are considered in §4. Mayer problems in parametric form are
considered in Part II.

A special case of the theorems of Part II, involving the existence of a
minimum for a function of integrals, has been treated by Gillespie.* Other
special cases of the theorems below have been given by Mania.† He assumes
that the equations (1) have the special form

$$z'_\rho = h_\rho(x, y, y', z_1, \dots, z_\rho),$$

where each function $h_\rho$ is monotone increasing with respect to $z_1, \dots, z_{\rho-1}$,
and monotone decreasing with respect to $z_\rho$. The present discussion includes
also systems of equations of the more general form

$$z'_\rho = h_\rho(x, y, y', z_1, \dots, z_s),$$

where each function $h_\rho$ is monotone increasing with respect to each of its
arguments $z_\sigma$. As Mania has noted, certain simple types of isoperimetric
problems are included as special cases. On the other hand, the more difficult
isoperimetric problems in parametric form treated by Tonelli‡ and by McShane§
are not implied by the theorems of this paper.

For a systematic presentation of certain fundamental concepts used, the 


* On functions of integrals, Proceedings of the Edinburgh Mathematical Society, (2), vol. 3 
(1932), pp. 87-98. 

† Esistenza dell’estremo assoluto in un classico problema di Mayer, Annali della Reale Scuola
Normale Superiore di Pisa, (2), vol. 2 (1933), pp. 343-354; Sul problema di Mayer, Rendiconti dei
Lincei, (6), vol. 18 (1933), pp. 358-365; Sui problemi di Lagrange e di Mayer, Rendiconti del Circolo
Matematico di Palermo, vol. 58 (1934), pp. 285-310.

‡ Fondamenti di Calcolo delle Variazioni, vol. II, pp. 468-482.

§ Semi-continuity in the calculus of variations, and absolute minima for isoperimetric problems, 
Dissertation, Chicago, 1930. Published in Contributions to the Calculus of Variations, 1930, Chicago, 
pp. 195-243. 
reader is referred to Tonelli’s Fondamenti, vols. I and II, to a previous paper
by the author,† and for the parametric problem to the dissertation by McShane
already referred to.


Part I. PROBLEMS IN NON-PARAMETRIC FORM 


1. Lower semi-continuity of solutions of differential equations. We shall 
use the letters w, , ¢, y, 2, w, A and their corresponding capitals as multi- 
partite symbols, that is, y shall stand for -- - , yx), and z for - , 2). 
We shall also use the summation convention of tensor analysis. The subscripts
i, j will usually range from 1 to k, while $\sigma$, $\rho$ will usually range from
1 to s. The appropriate ranges for subscripts will be indicated in other cases
so far as necessary by the context. The symbols $\|y\|$, $\|z\|$, etc., will be used
to denote the maximum $|y_i|$, the maximum $|z_\rho|$, etc. We shall be interested in
certain special existence theorems for solutions of equations of the form

(2) $$z_\rho(x) = \zeta_\rho + \int_a^x h_\rho(x, z)\,dx.$$

We shall suppose for Theorems 1 and 2 that the functions $h_\rho(x, z)$ are defined
and non-negative for $a \le x \le b$ and for all z, and that they are measurable on
$a \le x \le b$ for fixed z, and continuous in z for fixed x.


THEOREM 1. Suppose the functions $h_\rho$ are monotone increasing with respect
to each argument $z_\sigma$. Then if there exists a set of bounded measurable functions
$z^0_\rho(x)$ such that

$$z^0_\rho(x) \ge \zeta_\rho + \int_a^x h_\rho(x, z^0)\,dx,$$

the equations (2) have a uniquely determined least solution $z_\rho(x) \le z^0_\rho(x)$ on (a, b).


Let

$$z^n_\rho(x) = \zeta_\rho + \int_a^x h_\rho(x, z^{n-1})\,dx.$$

Then the sequence $z^n_\rho(x)$ is monotone decreasing with respect to n, and hence
converges to a solution $z^*$ of equations (2) with $z^*_\rho(x) \le z^0_\rho(x)$. If there are two
solutions $z^1$ and $z^2$, let z denote the logical product of $z^1$ and $z^2$, i.e., for each $\rho$
and x, $z_\rho(x)$ = lesser of $z^1_\rho(x)$ and $z^2_\rho(x)$. Then

$$z_\rho(x) \ge \zeta_\rho + \int_a^x h_\rho(x, z)\,dx,$$


† Graves, On the existence of the absolute minimum in space problems of the calculus of variations,
Annals of Mathematics, vol. 28 (1927), pp. 153-170. This paper will henceforth be referred to as 
“Annals.” 
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so that by the above there is a solution 2/7 («) <2,(x). The same reasoning is 
applicable to any finite number or to a denumerable infinity of solutions. 
Now let 2,"*(x) (n=1, 2,--- ;p=1,---, 5) be s sequences of solutions such 
that lim,..2,"°(b) =B,, where B, is the greatest lower bound of the values 
z,(b) for all possible solutions z(x).. Then there is a solution z(x) with 
z-(x) <2,"°(x) and hence z,(b) =B,. It is readily verified that there is only one 
solution satisfying the equations z,(b) = B,, and that it is the least solution of 
equations (2). 
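
The monotone iteration used in the proof of Theorem 1 can be tried numerically. The following is only a minimal sketch under stated assumptions: a single equation, the hypothetical integrand h(x, z) = z/2 on the interval (0, 1), the initial value ζ = 1, the starting function z⁰(x) = 2eˣ (which satisfies the inequality of the theorem), and a left Riemann sum in place of the integral. None of these choices comes from the paper.

```python
import math

def iterate_from_supersolution(h, zeta, z0, xs, iterations):
    """Return the iterates z^0, z^1, ... sampled on the uniform grid xs."""
    dx = xs[1] - xs[0]
    iterates = [list(z0)]
    for _ in range(iterations):
        previous = iterates[-1]
        current, integral = [], 0.0
        for k, x in enumerate(xs):
            # zeta plus a left Riemann sum of h(x, z^{n-1}) over [a, x_k]
            current.append(zeta + integral)
            integral += h(x, previous[k]) * dx
        iterates.append(current)
    return iterates

# Hypothetical data (assumptions, not from the paper):
# h is non-negative for z >= 0 and monotone increasing in z.
n = 200
xs = [k / n for k in range(n + 1)]
h = lambda x, z: 0.5 * z
zeta = 1.0
z0 = [2.0 * math.exp(x) for x in xs]  # z0 >= zeta + integral of h(x, z0)

iterates = iterate_from_supersolution(h, zeta, z0, xs, 40)
least = iterates[-1]
```

For this integrand the limit is approximately exp(x/2), the solution of z' = z/2 with z(0) = 1, and each iterate lies at or below the one before it, as the proof asserts.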

THEOREM 2. Suppose there is only one equation (2) and let the function $h(x, z)$ be monotone decreasing with respect to $z$. Suppose that $h(x, \zeta)$ is integrable. Then the equation (2) has a unique solution on the interval $(a, b)$.


Let $H(x, z) = h(x, z)$ for $z \ge \zeta$ and $H(x, z) = h(x, \zeta)$ for $z < \zeta$. Then $0 \le H(x, z) \le h(x, \zeta)$, so that by a theorem of Carathéodory† there is a solution $z(x)$ of the equation

$$z = \zeta + \int_a^x H(x, z)\,dx$$

on the interval $(a, b)$. Since $z(x) \ge \zeta$, this function is also a solution of equation (2). If there were two solutions $z^1(x)$ and $z^2(x)$, there would be an interval $c < x < d$ on which one solution is greater, say $z^1(x) > z^2(x)$, and such that $z^1(c) = z^2(c)$. Then

$$z^1(x) = z^1(c) + \int_c^x h(x, z^1)\,dx \le z^2(c) + \int_c^x h(x, z^2)\,dx = z^2(x),$$

which is a contradiction.

Let $A$ be a bounded closed set in $xy$-space. A curve

$$C:\quad y_i = y_i(x) \qquad (a \le x \le b)$$

will be called admissible provided its points $(x, y(x))$ lie in $A$ and its functions $y_i(x)$ are absolutely continuous. The curves $C$ of a class $K$ of absolutely continuous curves are said to be equally absolutely continuous‡ in case for every positive number $\epsilon$ there is a positive number $\delta$ such that for every curve $C$ in $K$, $\sum \|y(\alpha_n) - y(\beta_n)\| < \epsilon$ for every set of non-overlapping intervals $(\alpha_n, \beta_n)$ whose length-sum is less than $\delta$.

† Vorlesungen über reelle Funktionen, p. 672. The existence theorem applies even when there are several equations (2), but the uniqueness apparently does not. Other theorems on the existence of maximal and minimal solutions have been given by Kamke, Acta Mathematica, vol. 58 (1932), pp. 57-85.

‡ See Vitali, Sull'integrazione per serie, Rendiconti del Circolo Matematico di Palermo, vol. 23 (1907), p. 139; also Annals, pp. 156-157.

In comparing two curves

$$C_1:\ y_i = y_{1i}(x)\ \ (a_1 \le x \le b_1), \qquad C_2:\ y_i = y_{2i}(x)\ \ (a_2 \le x \le b_2),$$

or two functions $z_{1\sigma}(x)$ and $z_{2\sigma}(x)$, it is frequently convenient to extend the range of definition of the functions involved so that they will be continuous on the whole $x$-axis, by assigning constant values to the functions outside of the original interval of definition. With this understanding, the distance† $\|C_1, C_2\|$ of two curves is defined to be the maximum of the quantities

Let $h_\sigma(x, y, y', z)$ be a set of functions defined and continuous together with their partial derivatives $h_{\sigma y'_i}$ for $(x, y)$ in $A$ and for all $y'$ and all $z$. An element $(C, \zeta)$ will be called admissible provided its curve

$$C:\quad y_i = y_i(x) \qquad (a \le x \le b)$$

is admissible, and the equations

$$z_\sigma = \zeta_\sigma + \int_a^x h_\sigma(x, y(x), y'(x), z)\,dx$$

have a continuous solution on the closed interval $(a, b)$. However, for the statement of the next two theorems it is convenient to split the $z$-variables into two groups, which will be denoted by $w$ and $z$. The corresponding initial values will be denoted by $\omega$ and $\zeta$. The $w$'s will be supposed already determined as functions $w[x; C, \omega]$ defined and continuous in $x$ for $(C, \omega)$ in a certain class, and for $a \le x \le b$, and the $z$'s will be supposed to be determined as functions $z[x; C, \omega, \zeta]$ by the equations

$$z_\sigma = \zeta_\sigma + \int_a^x h_\sigma(x, y, y', w, z)\,dx. \tag{3}$$

In case the functions $h_\sigma$ are monotone increasing with respect to their arguments $z_\rho$, the existence of a solution implies by Theorem 1 the existence of a uniquely determined least solution $z[x; C, \omega, \zeta]$. The only other case we shall consider is that in which there is only one equation (3) and the function $h$ is monotone decreasing with respect to $z$. Then by Theorem 2, there is at most one solution. The function $z_\sigma[x; C, \omega, \zeta]$, regarded as defined on a class $K$ of admissible elements $(C, \omega, \zeta)$, will be said to be lower semi-continuous at an element $(C_0, \omega_0, \zeta_0)$ of $K$ uniformly with respect to $x$, in case for every positive number $\mu$ there is a positive number $\nu$ such that $z_\sigma[x; C, \omega, \zeta] > z_\sigma[x; C_0, \omega_0, \zeta_0] - \mu$ for all $x$ and for all $(C, \omega, \zeta)$ in $K$ such that $\|C, C_0\| < \nu$, $\|\omega - \omega_0\| < \nu$, $\|\zeta - \zeta_0\| < \nu$.

† See Annals, p. 155.
THEOREM 3. Suppose that each function $h_\sigma(x, y, y', w, z)$ is non-negative, and is monotone increasing with respect to each $w_\tau$ and each $z_\rho$. Suppose also that each function $h_\sigma$ satisfies the Weierstrass condition

$$E_\sigma(x, y, y', Y', w, z) = h_\sigma(x, y, Y', w, z) - h_\sigma(x, y, y', w, z) - (Y'_i - y'_i)\,h_{\sigma y'_i}(x, y, y', w, z) \ge 0$$

for all $(x, y, y', Y', w, z)$. Let $K$ be a class of admissible elements $(C, \omega, \zeta)$ whose curves $C$ are equally absolutely continuous, and suppose that the numbers $z_\sigma[b; C, \omega, \zeta]$ defined by equations (3) are bounded on $K$. Let $(C_0, \omega_0, \zeta_0)$ be an element of accumulation of $K$ such that the functions $w_{0\tau} = w_\tau[x; C_0, \omega_0]$ are defined and each $w_\tau[x; C, \omega]$ as on $K$ is lower semi-continuous at $(C_0, \omega_0)$ uniformly with respect to $x$. Then the element $(C_0, \omega_0, \zeta_0)$ is admissible, and each $z_\sigma[x; C, \omega, \zeta]$ as on $K$ is lower semi-continuous at $(C_0, \omega_0, \zeta_0)$ uniformly with respect to $x$.


Let $(C_n, \omega_n, \zeta_n)$ be an arbitrary sequence of elements of $K$ converging to $(C_0, \omega_0, \zeta_0)$. Since each $h_\sigma \ge 0$, the corresponding functions $z_{n\sigma}(x) = z_\sigma[x; C_n, \omega_n, \zeta_n]$ are monotone increasing functions of $x$, and since they are also uniformly bounded there exists† a sub-sequence which converges for every $x$ and $\sigma$ to a function $z^*_\sigma(x)$. The same notation will be used for this sub-sequence as for the original sequence. Let $z^{**}_\sigma(x)$ be step functions, continuous on the right, and everywhere less than $z^*_\sigma(x)$. Then for $n$ sufficiently large, $z_{n\sigma}(x) > z^{**}_\sigma(x)$ at each discontinuity of $z^{**}$ and hence for all $x$, since the functions $z_{n\sigma}$ are monotone increasing in $x$. Moreover, for every $\delta > 0$ we have for $n$ sufficiently large, $w_{n\tau}(x) = w_\tau[x; C_n, \omega_n] \ge w_{0\tau}(x) - \delta$. Hence for sufficiently large $n$,

$$z_{n\sigma}(x) - \zeta_{n\sigma} = \int_{a_n}^x h_\sigma(x, y_n, y'_n, w_n, z_n)\,dx \ge \int_{a_n}^x h_\sigma(x, y_n, y'_n, w_0 - \delta, z^{**})\,dx.$$

By known theorems on lower semi-continuity‡

$$z^*_\sigma(x) - \zeta_{0\sigma} \ge \liminf \int_{a_n}^x h_\sigma(x, y_n, y'_n, w_0 - \delta, z^{**})\,dx \ge \int_{a_0}^x h_\sigma(x, y_0, y'_0, w_0 - \delta, z^{**})\,dx.$$

It is easy to construct a sequence of step functions $z^{**}$ converging to $z^*$ except at the discontinuities of the latter. Since $\delta$ is also arbitrary, we have finally

$$z^*_\sigma(x) - \zeta_{0\sigma} \ge \int_{a_0}^x h_\sigma(x, y_0, y'_0, w_0, z^*)\,dx,$$

and the proof implies the existence of the integral. From Theorem 1 it follows that the element $(C_0, \omega_0, \zeta_0)$ is admissible, and $z_{0\sigma}(x) = z_\sigma[x; C_0, \omega_0, \zeta_0] \le z^*_\sigma(x) = \lim z_{n\sigma}(x)$. Suppose now that the final statement of the theorem is not true. Then there exists a positive constant $\mu$, a value of the index $\sigma$, and sequences $(C_n, \omega_n, \zeta_n)$ and $(x_n)$, such that $(C_n, \omega_n, \zeta_n)$ converges to $(C_0, \omega_0, \zeta_0)$, and $z_{n\sigma}(x_n) < z_{0\sigma}(x_n) - \mu$. These sequences may be chosen as in the preceding part of the proof and also so that $x_n$ converges to a point $\bar{x}$. Choose $\alpha$ so small that $z_{0\sigma}(\bar{x} - \alpha) > z_{0\sigma}(\bar{x}) - \mu/3$. Then for $n$ sufficiently large, $z_{n\sigma}(x_n) \ge z_{n\sigma}(\bar{x} - \alpha) > z_{0\sigma}(\bar{x} - \alpha) - \mu/3$ and $z_{0\sigma}(\bar{x}) > z_{0\sigma}(x_n) - \mu/3$, and these inequalities with the two preceding readily yield a contradiction.

† See Helly, Lineare Funktionaloperationen, Sitzungsberichte der Wiener Akademie, (IIA), vol. 121 (1912), p. 283.

‡ See Annals, pp. 164, 165. Hypothesis (II) of Annals is not required here, since the curves of $K$ are equally absolutely continuous.


THEOREM 4. Suppose the hypotheses of Theorem 3 hold, except that there is only one $z$ and one $h$, and the function $h$ is monotone decreasing with respect to $z$. Let the function $h$ satisfy the condition

(L) for every admissible curve $C$ and set of continuous functions $w_\tau(x)$, $z^1(x)$, $z^2(x)$, for which $h[x, y(x), y'(x), w(x), z^1(x)]$ is integrable, the function $h[x, y(x), y'(x), w(x), z^2(x)]$ is also integrable.†

Then the conclusions of Theorem 3 are still valid.‡


As before, we consider a sub-sequence $(C_n, \omega_n, \zeta_n)$ of an arbitrary sequence of elements of $K$ converging to $(C_0, \omega_0, \zeta_0)$ such that $z_n(x) = z[x; C_n, \omega_n, \zeta_n]$ converges to a function $z^*(x)$. Let $z^{**}(x)$ be a step function, continuous on the left, and everywhere greater than $z^*(x)$, and let $\alpha$ and $\beta$ be such that $a_0 \le \alpha < \beta \le b_0$. Then for $n$ sufficiently large,


† This condition is implied by a suitable Lipschitz condition, or by the condition (r) of Mania. See Rendiconti del Circolo Matematico di Palermo, loc. cit., pp. 303, 307.

‡ The following examples, in which the solutions of the differential equations fail to be lower semi-continuous, illustrate the need for the hypothesis that the function $h$ shall be monotone increasing with respect to the variables $w$. It will be noticed that the hypotheses (III), (V) to (VIII) of the existence theorem of §2 are all fulfilled. Let $w' = -w(1 + y_1'^2 + y_2'^2)^{1/2} + y_1'^2 + y_2'^2$, $z' = -w$ for $w < 0$, and $w' = y_1'^2 + y_2'^2$, $z' = 0$ for $w > 0$. Let the initial values be $\omega = -2$, $\zeta = 0$, for $x = 0$. Then $z$ is not lower semi-continuous at the curve $C_0$: $y_1 = 0$, $y_2 = 0$ for $x$ small, as may be seen by considering the sequence $C_n$: $y_1 = (\cos nx)/n$, $y_2 = (\sin nx)/n$. Another example is the following: w’=e—"(1+/? s’=e™, with initial conditions $\omega = \zeta = 0$ for $x = 0$.
$$z_n(\beta) - z_n(\alpha) \ge \int_\alpha^\beta h(x, y_n, y'_n, w_0 - \delta, z^{**})\,dx,$$

where it is agreed that the integrand $h$ is to be replaced by zero at points of the interval $(\alpha, \beta)$, if any, falling outside the interval $(a_n, b_n)$. By the same reasoning as before we find that

$$z^*(\beta) - z^*(\alpha) \ge \int_\alpha^\beta h(x, y_0, y'_0, w_0, z^*)\,dx.$$

From the condition (L) it follows that $h(x, y_0, y'_0, w_0, \zeta_0)$ is integrable, and by Theorem 2, $(C_0, \omega_0, \zeta_0)$ is admissible. Suppose now that $z^*(\beta) < z_0(\beta)$. Then since $z_0(x) = z[x; C_0, \omega_0, \zeta_0]$ is monotone increasing and continuous, and $z^*(x)$ is monotone increasing, there is a last point $\alpha < \beta$ at which $z^*(\alpha) \ge z_0(\alpha)$. On the interval $\alpha < x \le \beta$, $z^*(x) < z_0(x)$, and hence

$$\int_\alpha^\beta h(x, y_0, y'_0, w_0, z_0)\,dx \le \int_\alpha^\beta h(x, y_0, y'_0, w_0, z^*)\,dx \le z^*(\beta) - z^*(\alpha) < z_0(\beta) - z_0(\alpha).$$

This is a contradiction. Now from the result $z_0(x) \le z^*(x)$ we may argue the uniform lower semi-continuity as before.

2. A first existence theorem. We shall be concerned with functions $g(a, b, \eta, Y, \zeta, Z)$ and $h_\sigma(x, y, y', z)$ satisfying the following hypotheses.

(I) $g(a, b, \eta, Y, \zeta, Z)$ is defined and continuous for $(b, Y)$ in the bounded closed set $A$, $(a, \eta, \zeta)$ in a bounded closed set $S$ in $(x, y, z)$-space, and all $Z$, and is monotone increasing with respect to each argument $Z_\sigma$.

(II) There is a non-null subset of $[1 \le \sigma \le s]$, denoted by $[\sigma^*]$, such that $g(a, b, \eta, Y, \zeta, Z) \to +\infty$ with each $Z_{\sigma^*}$, the remaining arguments being fixed.

(III) The functions $h_\sigma(x, y, y', z)$ and their partial derivatives with respect to the $y'_i$ are defined and continuous for $(x, y)$ in $A$ and for all $y'$ and $z$. For the sake of greater generality in the following hypotheses we suppose that the range $[1 \le \sigma \le s]$ is divided into subsets by integers $0 = r_0 < r_1 < r_2 < \cdots$, in such a way that each $h_\sigma$ having $\sigma \le r_j$ is independent of all the $z_\rho$ having $\rho > r_j$. It is understood of course that we may have $r_1 = s$.

(IV) Each function $h_\sigma$ is either (a) monotone increasing with respect to each argument $z_\rho$; or (b) monotone increasing with respect to each $z_\rho$ for $\rho < \sigma$ and monotone decreasing with respect to $z_\sigma$, and $\sigma = r_j = 1 + r_{j-1}$, i.e., $\sigma$ belongs to a subset consisting of only one element.

(V) Each function $h_\sigma$ which is monotone decreasing with respect to $z_\sigma$ satisfies the condition (L) stated in Theorem 4, with the functions $w_\tau(x)$ replaced by $z_1(x), \dots, z_{\sigma-1}(x)$.

(VI) The functions $h_\sigma$ satisfy the Weierstrass condition, i.e., $E_\sigma(x, y, y', Y', z) \ge 0$ for $(x, y)$ in $A$, and all $y'$, $Y'$, $z$.

(VII) The functions $h_\sigma$ are non-negative for $(x, y)$ in $A$ and all $y'$, $z$.

(VIII) At least one of the functions $h_\sigma$ satisfies the condition that for every positive number $N$ there exists a constant $M$ and a function $\Phi(u)$ such that (1) $\Phi$ is positive, continuous, and monotone increasing for $u \ge 0$; (2) $\Phi(u)/u \to +\infty$ as $u \to +\infty$; (3) $h_\sigma(x, y, y', z) \ge \Phi(\|y'\|)$ for all $(x, y)$ in $A$, and $\|z\| \le N$, $\|y'\| > M$.

(IX) For every $\sigma$ not in the class $[\sigma^*]$ there exist a $\sigma^*$, a positive constant $\mu$, and an integrable function $\nu(x)$ such that $h_\sigma(x, y, y', z) \le \mu\,h_{\sigma^*}(x, y, y', z) + \nu(x)$ for all $(x, y)$ in $A$ and all $y'$, $z$.


Admissible elements $(C, \zeta)$ have been defined in §1. A class $K$ of admissible elements $(C, \zeta)$ is said to be closed in case it contains all its elements of accumulation $(C_0, \zeta_0)$ which are admissible. For example, a class consisting of all admissible elements $(C, \zeta)$ whose end values satisfy equations of the form $\psi_\mu[a, b, y(a), y(b), \zeta] = 0$ is a closed class, provided the $\psi_\mu$ are continuous. A class $K$ will be called quasi-closed in case it contains all elements $(C_0, \zeta_0)$ which are admissible and are limits of sequences $\{C_n, \zeta_n\}$ chosen from $K$, whose curves $C_n$ are equally absolutely continuous. The existence theorems will be stated for quasi-closed classes, since they are then more generally applicable. We note that if $K$ is a class of admissible elements $(C, \zeta)$ having initial values $(a, y(a), \zeta)$ in the set $S$, then the function

$$I(C, \zeta) = g(a, b, y(a), y(b), \zeta, z[b; C, \zeta])$$

is well defined on $K$.

EXISTENCE THEOREM. Let the functions $g$ and $h_\sigma$ satisfy the preceding hypotheses (I)-(IX), and let $K$ be a non-null bounded quasi-closed class of admissible elements $(C, \zeta)$ having initial values $(a, y(a), \zeta)$ in the set $S$. Then $I(C, \zeta)$ has a minimum on $K$.

3. Proof of the existence theorem. We first prove two lemmas.

LEMMA 1. Let $K_1$ be a class of admissible elements $(C, \zeta)$ on which the functions $z_\sigma[x; C, \zeta]$ are all bounded. Then the curves of $K_1$ are equally absolutely continuous.

Let $h_{\sigma_1}$ be a function satisfying hypothesis (VIII). Let $N \ge \|z[x; C, \zeta]\|$ and $B \ge z_{\sigma_1}[b; C, \zeta] - \zeta_{\sigma_1}$ on $K_1$, and let $E_1 = E[\|y'\| > M]$, $E_2 = E[\|y'\| \le M]$, $D$ = maximum difference of abscissas of points in $A$. Then

$$\int_a^b \Phi(\|y'\|)\,dx \le \int_{E_1} h_{\sigma_1}\,dx + \int_{E_2} \Phi(M)\,dx \le B + D\,\Phi(M).$$

From this inequality the equal absolute continuity of the curves of $K_1$ readily follows.†

LEMMA 2. Let $K_2$ be a class of admissible elements $(C, \zeta)$ having initial values $(a, y(a), \zeta)$ in the set $S$, and let $I(C, \zeta)$ be bounded above on $K_2$. Then each function $z_\sigma[b; C, \zeta]$ is bounded on $K_2$.

Suppose the lemma is false, and let $\{C_n, \zeta_n\}$ be a sequence selected from $K_2$, such that the corresponding sequence

$$\{a_n, b_n, y_n(a_n), y_n(b_n), \zeta_n, Z_n\} \to (a_0, b_0, \eta_0, Y_0, \zeta_0, Z_0),$$

where $Z_n = z[b_n; C_n, \zeta_n]$, and at least one component $Z_{0\sigma} = +\infty$. By hypothesis (IX) there must be a $\sigma^*$ such that $Z_{0\sigma^*} = +\infty$. The sequences $Z_{n\sigma}$ are bounded below, say $\ge B_\sigma$, since the set $S$ is bounded and each $h_\sigma \ge 0$. Let $Z^*_\sigma = B_\sigma$ for $\sigma$ different from the particular $\sigma^*$ just mentioned, and $Z^*_{\sigma^*} = Z_{n\sigma^*}$. Then for each $n$ there is an $m > n$ such that when $q > m$,


which is bounded. Hence $g[a_0, b_0, \eta_0, Y_0, \zeta_0, Z^*]$ is bounded, which contradicts hypothesis (II).

To complete the proof of the theorem, let $i$ be the greatest lower bound of $I(C, \zeta)$ on the class $K$. It is easily seen that $i$ is finite. If $\{C_n, \zeta_n\}$ is a minimizing sequence, it follows from Lemmas 2 and 1 that each component $z_\sigma[b_n; C_n, \zeta_n]$ is bounded and that the curves $C_n$ are equally absolutely continuous. From this we know by applying the theorems of Ascoli and Weierstrass-Bolzano that the sequence $\{C_n, \zeta_n\}$ has an element of accumulation $(C_0, \zeta_0)$. The curve $C_0$ is absolutely continuous and lies in $A$, while the initial values $(a_0, y_0(a_0), \zeta_0)$ lie in the set $S$. From Theorems 3 and 4 it follows that the element $(C_0, \zeta_0)$ is admissible and hence in the class $K$, and that

$$\liminf z_\sigma[b_n; C_n, \zeta_n] \ge z_\sigma[b_0; C_0, \zeta_0].$$

From the properties assumed for the function $g$ it follows without difficulty that $i = \lim I(C_n, \zeta_n) \ge I(C_0, \zeta_0) \ge i$.

† See Nagumo, Ueber die gleichmaessige Summierbarkeit und ihre Anwendung auf ein Variationsproblem, Japanese Journal of Mathematics, vol. 6 (1929), p. 173.
4. Generalizations. The existence theorem stated in §2 may be generalized in several ways. Let us consider first certain restrictions on the range of $y'$. Let $\psi_{\rho i}$, $\Psi_\rho$, $\theta_{\sigma i}$, $\Theta_\sigma$, $\phi_\tau$ be functions of $x$ and $y_i$ which, together with the partial derivatives $\phi_{\tau x}$, $\phi_{\tau y_i}$, are defined and continuous on the set $A$. Let $R$ denote the set of $(x, y, y')$ points having $(x, y)$ in $A$ and satisfying the conditions


dry, Vi + = 0, 
+V¥,=0, +O0,2 0. 


A curve $C$: $y_i = y_i(x)$ will be said to lie in $R$ in case its elements $(x, y(x), y'(x))$ lie in $R$ for almost all $x$. The hypotheses (III)-(IX) inclusive may be modified by assuming that they hold only for $(x, y, y')$ in $R$, except that (VI) has the form $E_\sigma(x, y, y', Y', z) \ge 0$ for every $(x, y, Y')$ and $(x, \bar{y}, y')$ in $R$ and every $z$.† Let the class $K$ of the existence theorem satisfy the additional condition that its curves $C$ lie in $R$. Then $I(C, \zeta)$ has a minimum on $K$. The proof is essentially the same as before.

It is important to note that if $K_0$ is a given quasi-closed class of admissible elements $(C, \zeta)$, and $K$ consists of all the elements of $K_0$ whose curves $C$ lie in $R$, then $K$ is also quasi-closed. This is a consequence of the

LEMMA 3. Let $\{C_n\}$ be a sequence of curves in $R$, equally absolutely continuous, and converging to a curve $C_0$. Then $C_0$ also lies in $R$.


It is clear that the equations $\phi_\tau(x, y) = 0$ will be satisfied by $C_0$. Consider now the function $w$ defined by an equation of the form

$$w[x; C] = \int_a^x \bigl[\psi_i(x, y)\,y'_i + \Psi(x, y)\bigr]\,dx.$$

Since the Weierstrassian $E$-function vanishes identically for an integrand linear in $y'$, it follows from a theorem on semi-continuity previously quoted that $\lim w[x; C_n] = w[x; C_0]$. If each $w[x; C_n]$ vanishes identically, so does $w[x; C_0]$. If each $w[x; C_n]$ is a monotone increasing function of $x$, so is $w[x; C_0]$. This argument shows at once that $C_0$ lies in $R$.

As a second generalization we may impose certain restrictions on the range of $z_\sigma$ and $Z_\sigma$. Let $\beta_\sigma(x, y)$ be functions defined on the set $A$, with values which are finite or $-\infty$, bounded above, and continuous except for infinite discontinuities. The subset of $A$ on which a given $\beta_\sigma = -\infty$ is then unrestricted except that it is closed. Let $B_\sigma(x, y)$ be similar functions, with values finite or $+\infty$, and bounded below. The hypotheses are modified as follows:

† See Graves, Bulletin of the American Mathematical Society, loc. cit. In certain cases this hypothesis may be weakened by requiring $\bar{y} = y$. A modification of the proof is required.

(I) holds only for $\beta_\sigma(b, Y) \le Z_\sigma \le B_\sigma(b, Y)$, and the initial values in the set $S$ satisfy‡ $\beta_\sigma(a, \eta) \le \zeta_\sigma \le B_\sigma(a, \eta)$. (II) has no meaning and is omitted for those values of $\sigma^*$, $b$, $Y$ for which $B_{\sigma^*}(b, Y)$ is finite. The remaining hypotheses (III)-(IX) are understood to hold only for $\beta_\rho(x, y) \le z_\rho \le B_\rho(x, y)$. It is easily seen that the range of the functions $h_\sigma$ may be extended to the whole $z$-space by setting $h_\sigma(x, y, y', z) = h_\sigma(x, y, y', \bar{z})$, where $\bar{z}_\rho = \beta_\rho(x, y)$ if $z_\rho < \beta_\rho(x, y)$, and $\bar{z}_\rho = B_\rho(x, y)$ if $z_\rho > B_\rho(x, y)$. The original hypotheses (III)-(IX) are satisfied by the functions $h_\sigma$ so extended. With reference to the extended functions $h_\sigma$ the term "admissible element $(C, \zeta)$" will be used in the same sense as before.
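
The extension of the integrands to the whole z-space amounts to clamping each z-component to its allowed range before evaluating. A minimal sketch follows; the integrand `h` and the bound functions `beta` and `upper` are hypothetical stand-ins chosen only for illustration, with an infinite bound represented by `math.inf`.

```python
import math

def extend_by_clamping(h, beta, upper):
    """Extend h to all z by replacing each z_rho with the nearest value
    in the closed range [beta_rho(x, y), upper_rho(x, y)]."""
    def h_extended(x, y, y_prime, z):
        lows, highs = beta(x, y), upper(x, y)
        z_bar = [min(max(z_rho, lo), hi)
                 for z_rho, lo, hi in zip(z, lows, highs)]
        return h(x, y, y_prime, z_bar)
    return h_extended

# Hypothetical example with two z-variables; the second has no upper bound.
h = lambda x, y, y_prime, z: z[0] + 2.0 * z[1]
beta = lambda x, y: [0.0, 0.0]
upper = lambda x, y: [1.0, math.inf]
h_ext = extend_by_clamping(h, beta, upper)
```

Inside the allowed range the extended function agrees with the original one, and outside it the extended function is constant in the offending component, so monotonicity and non-negativity are preserved.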

Let $K_0$ be a quasi-closed class of admissible elements, and let $K$ be the sub-class of $K_0$ consisting of all those elements $(C, \zeta)$ having initial values $(a, y(a), \zeta)$ in the set $S$, $\zeta_\sigma \ge$ maximum of $\beta_\sigma(x, y(x))$, and $z_\sigma[x; C, \zeta] \le B_\sigma(x, y(x))$. Then from Theorems 3 and 4 it follows that $K$ will be quasi-closed. The existence theorem is evidently valid for quasi-closed classes $K$ of elements satisfying the conditions just written down.

As a third generalization we note that certain of the hypotheses may be weakened as follows. In place of (VIII) we may assume that there exists a function $f(x, y)$ of class $C'$ on the set $A$ such that one of the functions $h^\dagger_\sigma(x, y, y', z) = h_\sigma(x, y, y', z) + f_x(x, y) + f_{y_i}(x, y)\,y'_i$ satisfies (VIII). Condition (VIII) enters only in the proof of Lemma 1, which we revise as follows:

$$\int_a^b \Phi(\|y'\|)\,dx \le \int_{E_1} h^\dagger_{\sigma_1}\,dx + \int_{E_2} \Phi(M)\,dx$$

$$= \int_{E_1} h_{\sigma_1}\,dx + \int_{E_2} \Phi(M)\,dx + f(b, y(b)) - f(a, y(a)) - \int_{E_2} (f_x + f_{y_i}\,y'_i)\,dx.$$

Since $\|y'\|$ is uniformly bounded on the set $E_2$, the additional terms are easily seen to be bounded.
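
The revised estimate rests on the fact that the added terms form an exact derivative along the curve, so that their integral over the whole interval depends only on the end points. This can be checked numerically. The sketch below is an illustration under assumptions: the function f(x, y) = xy + sin y and the curve y = x² on (0, 1) are hypothetical choices, and the integral is approximated by the midpoint rule.

```python
import math

def midpoint_integral(func, a, b, n=20000):
    """Approximate the integral of func over [a, b] by the midpoint rule."""
    step = (b - a) / n
    return sum(func(a + (k + 0.5) * step) for k in range(n)) * step

# Hypothetical f of class C' and its partial derivatives.
f = lambda x, y: x * y + math.sin(y)
f_x = lambda x, y: y
f_y = lambda x, y: x + math.cos(y)

# Hypothetical admissible curve y = x^2 on [0, 1].
y = lambda x: x * x
y_prime = lambda x: 2.0 * x
a, b = 0.0, 1.0

integral = midpoint_integral(
    lambda x: f_x(x, y(x)) + f_y(x, y(x)) * y_prime(x), a, b)
difference = f(b, y(b)) - f(a, y(a))
```

The two numbers `integral` and `difference` agree to within the quadrature error, which is why adding the terms in f changes the integral only by a quantity bounded on a bounded set A.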

In place of (VII) we may assume that there exist functions $F_\sigma(x, y)$, of class $C'$ on $A$, such that the functions $h^\dagger_\sigma(x, y, y', z^\dagger) = h_\sigma(x, y, y', z^\dagger - F(x, y)) + F_{\sigma x}(x, y) + F_{\sigma y_i}(x, y)\,y'_i$ satisfy condition (VII). If we use also the transformation $z^\dagger_\sigma = z_\sigma + F_\sigma(x, y)$, $\zeta^\dagger_\sigma = \zeta_\sigma + F_\sigma(a, y(a))$, $g^\dagger(a, b, \eta, Y, \zeta^\dagger, Z^\dagger) = g(a, b, \eta, Y, \zeta^\dagger - F(a, \eta), Z^\dagger - F(b, Y))$, we see that the transformed functions still satisfy the hypotheses (I)-(VI) and the weakened form of (VIII). In case there are restrictions $\beta_\sigma(x, y) \le z_\sigma \le B_\sigma(x, y)$ on the $z_\sigma$, we note that the transformed regions $\beta_\sigma(x, y) + F_\sigma(x, y) \le z^\dagger_\sigma \le B_\sigma(x, y) + F_\sigma(x, y)$ have all the properties required of the original ones.

‡ Whenever a $\beta_\sigma$ or $B_\sigma$ is infinite, the corresponding inequality is to be omitted.

Condition (IX) may be modified in the same way as (VII), but the functions $F_\sigma(x, y)$ need not be the same, since (IX) enters only in the proof of Lemma 2, and the transformed functions $z^\dagger_\sigma$ are bounded if and only if the $z_\sigma$ are bounded.

Finally, the restriction that the set $A$ of $(x, y)$ points is bounded may be removed in the usual way. Hypotheses (VIII) and (IX) are assumed to hold only for every bounded subset $A_1$ of $A$. Hypotheses (I)-(VII) require no change. It is assumed that the end values $(a, \eta, \zeta)$ and $(b, Y)$ of elements $(C, \zeta)$ of the class $K$ in which a minimum is sought lie in bounded closed sets. We require in addition the condition

(X) There is a value $\sigma^*$ of the index $\sigma$, belonging to the class mentioned in hypothesis (II), such that $z_{\sigma^*}[b; C, \zeta] \to +\infty$ with the maximum distance of a point of $C$ from the origin in $(x, y)$-space.

Under these hypotheses, Lemma 2 still holds and its proof implies also that the curves $C$ involved lie in a bounded subset of the set $A$.

The condition (X) is implied by various conditions more directly applicable to the functions $h_\sigma$, as has been indicated for simpler problems by Tonelli† and by Graves.§

† Fondamenti, vol. I, pp. 308, 311.

§ Annals, p. 168; Bulletin of the American Mathematical Society, loc. cit.


Part II. PROBLEMS IN PARAMETRIC FORM


5. Lower semi-continuity of solutions of differential equations. The theory for problems in parametric form involves only the usual modifications of the preceding. We let $A$ denote a bounded closed set in the $k$-dimensional $y$-space. A curve

$$C:\quad y_i = y_i(t) \qquad (a \le t \le b)$$

is admissible provided it is rectifiable and lies in $A$. We shall always suppose the parameter $t$ so chosen that the functions $y_i(t)$ are absolutely continuous. Let the functions $h_\sigma(y, y', z)$ together with their partial derivatives $h_{\sigma y'_i}$ be defined and continuous for $y$ in $A$, all $y' \ne 0$, and all $z$, and let each $h_\sigma(y, y', z)$ be positively homogeneous of degree one in the arguments $y'_i$. An element $(C, \zeta)$ is admissible provided $C$ is admissible and the equations

$$z_\sigma(t) = \zeta_\sigma + \int_a^t h_\sigma(y, y', z)\,dt \tag{4}$$

have a continuous solution $z_\sigma[t; C, \zeta]$ on $a \le t \le b$. On account of the homogeneity assumed for the functions $h_\sigma$, solutions of these equations are invariant under change of parameter so long as the functions $y_i(t)$ remain absolutely continuous. As before, we shall suppose for the next two theorems that the functions $h_\sigma$ depend also on certain variables $w_\tau$, and that the $w$'s are already determined as functions $w_\tau[t; C, \omega]$ of $C$ and initial values $\omega$. The function $z_\sigma[t; C, \omega, \zeta]$, regarded as defined on a class $K$ of admissible elements $(C, \omega, \zeta)$, will be said to be uniformly lower semi-continuous at an element $(C_0, \omega_0, \zeta_0)$ of $K$ in case for every positive number $\mu$ there is a positive number $\nu$ such that, for every element $(C, \omega, \zeta)$ in $K$ and choice of parameter such that

$$\|y(t) - y_0(t)\| < \nu, \qquad \|\omega - \omega_0\| < \nu, \qquad \|\zeta - \zeta_0\| < \nu,$$

we have $z_\sigma[t; C, \omega, \zeta] > z_\sigma[t; C_0, \omega_0, \zeta_0] - \mu$ $(a \le t \le b)$.
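
The invariance under change of parameter that follows from homogeneity can be observed numerically. The sketch below is an illustration under assumptions, not part of the paper: the integrand h(y, y') = (1 + y₁²)·|y'| is a hypothetical function that is positively homogeneous of degree one in y' and independent of z, the initial value is taken to be zero, and the same arc is described once by y = (t, t²) and once by y = (s², s⁴) on (0, 1).

```python
import math

def midpoint_integral(func, a, b, n=20000):
    """Approximate the integral of func over [a, b] by the midpoint rule."""
    step = (b - a) / n
    return sum(func(a + (k + 0.5) * step) for k in range(n)) * step

def h(y, y_prime):
    # Hypothetical integrand, positively homogeneous of degree one in y_prime.
    return (1.0 + y[0] ** 2) * math.hypot(y_prime[0], y_prime[1])

def z_at_end(curve, velocity, a, b):
    """Value z(b) from equation (4) with zero initial value and h independent of z."""
    return midpoint_integral(lambda t: h(curve(t), velocity(t)), a, b)

# Two representations of the same arc (related by t = s^2).
z_first = z_at_end(lambda t: (t, t * t),
                   lambda t: (1.0, 2.0 * t), 0.0, 1.0)
z_second = z_at_end(lambda s: (s * s, s ** 4),
                    lambda s: (2.0 * s, 4.0 * s ** 3), 0.0, 1.0)
```

The two end values agree up to quadrature error. With an integrand that is not homogeneous of degree one, for example the square of |y'|, the two representations would in general give different values.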


THEOREM 5. Suppose that each function $h_\sigma(y, y', w, z)$ is non-negative, and is monotone increasing with respect to each $w_\tau$ and each $z_\rho$. Suppose also that each function $h_\sigma$ satisfies the Weierstrass condition $E_\sigma(y, y', Y', w, z) = h_\sigma(y, Y', w, z) - Y'_i\,h_{\sigma y'_i}(y, y', w, z) \ge 0$ for all $y$ in $A$, and all $w$ and $z$. Let $K$ be a class of admissible elements $(C, \omega, \zeta)$ whose curves $C$ have bounded lengths, and suppose that the numbers $z_\sigma[b; C, \omega, \zeta]$ defined by equations (4) are bounded on $K$. Let $(C_0, \omega_0, \zeta_0)$ be an element of accumulation of $K$ such that the functions $w_{0\tau} = w_\tau[t; C_0, \omega_0]$ are defined and each $w_\tau[t; C, \omega]$ as on $K$ is uniformly lower semi-continuous at $(C_0, \omega_0)$. Then the element $(C_0, \omega_0, \zeta_0)$ is admissible, and each $z_\sigma[t; C, \omega, \zeta]$ as on $K$ is uniformly lower semi-continuous at $(C_0, \omega_0, \zeta_0)$.


Let $(C_n, \omega_n, \zeta_n)$ be an arbitrary sequence of elements of $K$, converging to $(C_0, \omega_0, \zeta_0)$. We may choose an arbitrary fixed representation of $C_0$: $y_i = y_{0i}(t)$, and then select the representation of $C_n$: $y_i = y_{ni}(t)$ in any manner such that the functions $y_{ni}(t)$ converge uniformly to $y_{0i}(t)$. We proceed as in the proof of Theorem 3 to show† that the element $(C_0, \omega_0, \zeta_0)$ is admissible and that for a properly chosen sub-sequence, $\lim z_\sigma[t; C_n, \omega_n, \zeta_n] \ge z_\sigma[t; C_0, \omega_0, \zeta_0]$. Suppose that the uniform lower semi-continuity at $(C_0, \omega_0, \zeta_0)$ is false. Then there exist a positive number $\mu$, a value of the index $\sigma$, and a sequence of elements $(C_n, \omega_n, \zeta_n)$ converging to $(C_0, \omega_0, \zeta_0)$ with representations $C_n$: $y_i = y_{ni}(t)$, $C_0$: $y_i = y^n_{0i}(t)$, such that $\|y_n(t) - y^n_0(t)\| \to 0$ uniformly, while for a properly selected value $t_n$ of the parameter, $z_{n\sigma}(t_n) \le z^n_{0\sigma}(t_n) - \mu$. Let $y_i = y_{0i}(s)$ be the representation of $C_0$ in terms of its arc length as parameter, and let the functions $z_{0\sigma}(s)$ correspond to this representation. Then by means of functions $t = t_n(s)$ we may transform the parameter on the curves $C_n$, so that for the new representation $y_i = \bar{y}_{ni}(s)$ we still have $\|\bar{y}_n(s) - y_0(s)\| \to 0$ uniformly. To the values $t_n$ of the parameter there correspond values $s_n$ and $\bar{s}_n$ such that $\bar{z}_{n\sigma}(\bar{s}_n) \le z_{0\sigma}(s_n) - \mu$. It is easily verified that $s_n - \bar{s}_n \to 0$. From this and from the preceding part of the proof it follows that the sequence $\{C_n, \omega_n, \zeta_n\}$ and a value $s_0$ may be so chosen that $s_n \to s_0$, $\bar{s}_n \to s_0$, and $\lim \bar{z}_{n\sigma}(s_0 - \alpha) \ge z_{0\sigma}(s_0 - \alpha)$, where $\alpha$ is an arbitrary positive number. Then for $\alpha$ sufficiently small and $n$ sufficiently large, $\bar{z}_{n\sigma}(s_0 - \alpha) > z_{0\sigma}(s_n) - \mu/2$, $\bar{z}_{n\sigma}(\bar{s}_n) \ge \bar{z}_{n\sigma}(s_0 - \alpha)$, and from these inequalities and the preceding we readily obtain a contradiction.

† For the lower semi-continuity of $\int h_\sigma(y, y', w_0 - \delta, z^{**})\,dt$, see McShane, Dissertation, p. 11, Theorem I.


THEOREM 6. Let the hypotheses of Theorem 5 hold, except that there is only one $z$ and one $h$, and the function $h$ is monotone decreasing with respect to $z$. Then the conclusions of Theorem 5 are still valid.

In this case the element $(C_0, \omega_0, \zeta_0)$ is seen at once to be admissible, by virtue of Theorem 2, since the function $h(y_0, y'_0, w_0, \zeta_0)$ is bounded when the arc length is chosen as parameter. The remainder of the proof may be made as for Theorems 4 and 5.

6. An existence theorem. For the parametric problem certain of the hypotheses made in §2 are modified as follows.

(I) The arguments $a$ and $b$ are omitted from $g$.

(III) The functions $h_\sigma$ are independent of $x$, and are positively homogeneous of degree one in $y'$. The partial derivatives $h_{\sigma y'_i}$ are not defined for $y' = 0$.

(V) is omitted.

(VIII) At least one function $h_\sigma$ satisfies the condition that for every positive number $N$ there exists a positive constant $m$ such that $h_\sigma(y, y', z) \ge m$ for all $y$ in $A$, $\|y'\| = 1$, and $\|z\| \le N$.

(IX) For every $\sigma$ not in the class $[\sigma^*]$ there exist a $\sigma^*$ and a positive constant $\mu$ such that $h_\sigma(y, y', z) \le \mu\,h_{\sigma^*}(y, y', z)$ for all $y$ in $A$, all $y'$ and all $z$.

A quasi-closed class of admissible elements $(C, \zeta)$ is now defined to be a class $K$ which contains all elements $(C_0, \zeta_0)$ which are admissible and are limits of sequences $\{C_n, \zeta_n\}$ chosen from $K$ whose curves $C_n$ have uniformly bounded lengths.

EXISTENCE THEOREM. Let the functions $g$ and $h_\sigma$ satisfy the hypotheses (I)-(IV) and (VI)-(IX) as modified, and let $K$ be a non-null bounded quasi-closed class of admissible elements $(C, \zeta)$ having initial values in the set $S$. Then $I(C, \zeta)$ has a minimum on $K$.

The proof follows the same lines as before, except for the obvious modification in Lemma 1. Moreover, the theorem admits of the same generalizations as those described in §4 for the non-parametric problem. There is one deviation in the generalized hypotheses, in that when there are differential equations $\psi_{\rho i}(y)\,y'_i = 0$, etc., linear in the $y'_i$, the condition (VI) now has the form $E_\sigma(y, y', Y', z) \ge 0$ for every $(\bar{y}, Y')$ and $(y, y')$ in the set $R$ and every $z$. This difference is due to the difference in the method of proving the lower semi-continuity of a parametric integral from the method used in the non-parametric case.
SOME PROPERTIES OF CONVERSION*


BY
ALONZO CHURCH AND J. B. ROSSER


Our purpose is to establish the properties of conversion which are expressed in Theorems 1 and 2 below. We shall consider first conversion defined by Church's Rules I, II, III† and shall then extend our results to several other kinds of conversion.

1. Conversion defined by Church's Rules I, II, III. In our study of conversion we are particularly interested in the effects of Rules II and III and consider that applications of Rule I, though often necessary to prevent confusion of free and bound variables, do not essentially change the structure of a formula. Hence we shall omit mention of applications of Rule I whenever it seems that no essential ambiguity will result. Thus when we speak of replacing $\{\lambda x.M\}(N)$§ by $S^x_N M|$ it shall be understood that any applications of I are made which are needed to make this substitution an application of II. Also we may write bound variables as unchanged throughout discussions even though tacit applications of I in the discussion may have changed them.

A conversion in which III is not used and II is used exactly once will be called a reduction. If II is not used and III is used exactly once, the conversion will be called an expansion. “A imr B,” read “A is immediately reducible to B,” shall mean that it is possible to go from A to B by a single reduction. “A red B,” read “A is reducible to B,” shall mean that it is possible to go from A to B by one or more reductions.‖ “A conv-I B,” read “A conv B by applications of I only,” shall mean just that (including the case of a zero number of applications). “A conv-I-II B,” read “A conv B by applications of I and II only,” shall mean just that (including the case of a zero number of applications).

* Presented to the Society, April 20, 1935; received by the editors June 4, 1935.

† By Church's rules we shall mean the rules of procedure given in A. Church, A set of postulates for the foundation of logic, Annals of Mathematics, (2), vol. 33 (1932), pp. 346-366 (see pp. 355-356), as modified by S. C. Kleene, Proof by cases in formal logic, Annals of Mathematics, (2), vol. 35 (1934), pp. 529-544 (see p. 530). We assume familiarity with the material on pp. 349-355 of Church's paper and in §§1, 2, 3, 5 of Kleene's paper. We shall refer to the latter paper as “Kleene.”

‡ The authors are indebted to Dr. S. C. Kleene for assistance in the preparation of this paper, in particular for the detection of an error in the first draft of it and for the suggestion of an improvement in the proof of Theorem 2.

§ Note carefully the convention at the beginning of §3, Kleene, which we shall constantly use.

‖ Our use of “conv” allows us to write “A conv B” even in the case that no applications of I, II, or III are made in going from A to B and A is the same as B. But we write “A red B” only if there is at least one reduction in the process of going from A to B by applications of I and II, and use the notation “A conv-I-II B” if we wish to allow the possibility of no reductions.

We shall say that we contract or perform a contraction on $\{\lambda x.M\}(N)$ if we replace it by $S^x_N M|$.

It is possible to visualize the process of conversion by drawing a broken 
line in which the segments correspond to successive steps of the conversion, 
horizontal segments indicating applications of I, segments of negative slope 
applications of II, and segments of positive slope applications of III. Thus, 
in the figure A: conv Ae and B, conv B: each by a single use of I, Az conv As; 
and Bz corv B; each by a single use of III, and A; conv As and C; conv C2 
each by a single use of II. The dotted lines represent various alternative con- 
versions to the conversion given by the solid line. 


A conversion in which no expansions follow any reductions will be called
a peak and one in which no reductions follow any expansions will be called a
valley. The central theorem of this paper states that if A conv B, there is a 
conversion from A to B which is a valley. We prove it by means of a lemma 
which states that a peak in which there is a single reduction can always be 
replaced by a valley. Then the theorem becomes obvious. For example, in 
the conversion pictured by the solid line in the figure we replace the peak 
AA; - - - As by the valley A,:B:iB2B;As, then the peak B2B;AsAy by the valley 
BeAg, then the peak - - - Au by the valley AisC:C2C;Au, getting the 
valley A,B, 

Suppose that a formula A has parts $\{\lambda x_j.M_j\}(N_j)$ which may or may not
be parts of each other (cf. Kleene 2VIII (p. 532)). We suppose that, if $p \neq q$,
$\{\lambda x_p.M_p\}(N_p)$ is not the same part as $\{\lambda x_q.M_q\}(N_q)$, though it may be the
same formula. The $\{\lambda x_j.M_j\}(N_j)$ need not be all of the parts of A which have
the form $\{\lambda y.P\}(Q)$. We shall define the residuals of the $\{\lambda x_j.M_j\}(N_j)$ after
a sequence of applications of I and II (these residuals being certain well-formed
parts of the formula which results from the sequence of applications of


I and II). If no applications of I or II occur, each part $\{\lambda x_j.M_j\}(N_j)$ is its
own residual. If a series of applications of I occur, then each part $\{\lambda x_j.M_j\}(N_j)$
is changed into a part $\{\lambda y_j.M_j'\}(N_j')$ of the resulting formula and this part
$\{\lambda y_j.M_j'\}(N_j')$ is the residual of $\{\lambda x_j.M_j\}(N_j)$ in the resulting formula.
Clearly the residuals of two parts coincide only if the parts coincide. Next
consider the case where a single reduction occurs. Let $\{\lambda x.M\}(N)$ be the part
of A which is contracted in performing the reduction and A’ be the formula
resulting from A by the reduction. Let $\{\lambda x_p.M_p\}(N_p)$ be an arbitrary one
of the $\{\lambda x_j.M_j\}(N_j)$.

Case 1. Let $\{\lambda x_p.M_p\}(N_p)$ not be part of $\{\lambda x.M\}(N)$. Then either (a)
$\{\lambda x.M\}(N)$ has no part in common with $\{\lambda x_p.M_p\}(N_p)$ or else (b)
$\{\lambda x.M\}(N)$ is part of $\{\lambda x_p.M_p\}(N_p)$ (by Kleene 2VIII). If (a) holds,
then under the reduction from A to A’, $\{\lambda x_p.M_p\}(N_p)$ goes into a definite
part of A’ which we shall call the residual in A’ of $\{\lambda x_p.M_p\}(N_p)$. In this
case the residual of $\{\lambda x_p.M_p\}(N_p)$ is the same formula as $\{\lambda x_p.M_p\}(N_p)$. If
(b) holds then $\{\lambda x.M\}(N)$ is part either of $M_p$ or of $N_p$ since $\{\lambda x_p.M_p\}(N_p)$
is not part of $\{\lambda x.M\}(N)$ (see Kleene, 2X and 2XII). Hence if we contract
$\{\lambda x.M\}(N)$ we perform a reduction on the formula $\{\lambda x_p.M_p\}(N_p)$ which
carries it into a formula $\{\lambda x_p.M_p'\}(N_p')$. Then the reduction from A to A’
can be considered as consisting of the replacement of the part $\{\lambda x_p.M_p\}(N_p)$
of A by $\{\lambda x_p.M_p'\}(N_p')$ and this particular occurrence of $\{\lambda x_p.M_p'\}(N_p')$
in A’ is called the residual in A’ of $\{\lambda x_p.M_p\}(N_p)$.

Case 2. Let $\{\lambda x_p.M_p\}(N_p)$ be part of $\{\lambda x.M\}(N)$. By Kleene 2X this
case breaks up into three subcases.

(a) Let $\{\lambda x_p.M_p\}(N_p)$ be $\{\lambda x.M\}(N)$. Then we say that $\{\lambda x_p.M_p\}(N_p)$
has no residual in A’.

(b) Let $\{\lambda x_p.M_p\}(N_p)$ be part of $\lambda x.M$ and hence part of M (by Kleene
2XII). Let M’ be the result of replacing all free x’s of M except those occurring
in $\{\lambda x_p.M_p\}(N_p)$ by N. Under these changes the part $\{\lambda x_p.M_p\}(N_p)$ of
M goes into a definite part of M’ which we shall denote also by $\{\lambda x_p.M_p\}(N_p)$,
since it is the same formula. If now we replace $\{\lambda x_p.M_p\}(N_p)$ in M’ by
$S^x_N \{\lambda x_p.M_p\}(N_p)|$, M’ becomes $S^x_N M|$ and we denote by $S^x_N \{\lambda x_p.M_p\}(N_p)|$
the particular occurrence of $S^x_N \{\lambda x_p.M_p\}(N_p)|$ in $S^x_N M|$ that resulted from
replacing $\{\lambda x_p.M_p\}(N_p)$ in M’ by the formula $S^x_N \{\lambda x_p.M_p\}(N_p)|$. Now the
residual in A’ of $\{\lambda x_p.M_p\}(N_p)$ in A is defined to be the part $S^x_N \{\lambda x_p.M_p\}(N_p)|$
in the particular occurrence of $S^x_N M|$ in A’ that resulted from replacing
$\{\lambda x.M\}(N)$ in A by $S^x_N M|$.

(c) Let $\{\lambda x_p.M_p\}(N_p)$ be part of N. And let $\{\lambda y_i.P_i\}(Q_i)$ respectively
stand for the particular occurrences of the formula $\{\lambda x_p.M_p\}(N_p)$ in $S^x_N M|$
which are the part $\{\lambda x_p.M_p\}(N_p)$ in each of the particular occurrences of the
formula N in $S^x_N M|$ that resulted from replacing the free x’s of M by N. Now
the residuals in A’ of $\{\lambda x_p.M_p\}(N_p)$ in A are the parts $\{\lambda y_i.P_i\}(Q_i)$ in
the particular occurrence of the formula $S^x_N M|$ in A’ that resulted from
replacing $\{\lambda x.M\}(N)$ in A by $S^x_N M|$.

This completes the definition of the residuals in A’ of $\{\lambda x_p.M_p\}(N_p)$
in A in the case that A’ is obtained from A by a single reduction. Clearly the
residuals in A’ of $\{\lambda x_p.M_p\}(N_p)$ in A (if any) have the form $\{\lambda x.M\}(N)$. Call
them $\{\lambda y_i.P_i\}(Q_i)$. Then if A’ imr A’’, the residuals in A’’ of $\{\lambda x_p.M_p\}(N_p)$
in A are defined to be the residuals in A’’ of the $\{\lambda y_i.P_i\}(Q_i)$ in A’. We
continue in this way, defining the residuals after each successive reduction as
the residuals of the formulas that were residuals before the reduction, and
noting that the residuals always have the form $\{\lambda x.M\}(N)$. We also note
that a residual in B of the part $\{\lambda x.M\}(N)$ in A cannot coincide with a
residual in B of the part $\{\lambda x'.M'\}(N')$ in A unless $\{\lambda x.M\}(N)$ coincides
with $\{\lambda x'.M'\}(N')$.

We say that a sequence of reductions on A, say A imr $A_1$ imr $A_2$ imr $\cdots$
imr $A_{n+1}$, is a sequence of contractions on the parts $\{\lambda x_j.M_j\}(N_j)$ of A if the
reduction from $A_i$ to $A_{i+1}$ ($i = 0, \cdots, n$; $A_0$ the same as A) is a contraction
on one of the residuals in $A_i$ of the $\{\lambda x_j.M_j\}(N_j)$. Moreover, if no residuals
of the $\{\lambda x_j.M_j\}(N_j)$ occur in $A_{n+1}$, we say that the sequence of contractions
on the $\{\lambda x_j.M_j\}(N_j)$ terminates and that $A_{n+1}$ is the result.

In some cases we wish to speak of a sequence of contractions on the parts
$\{\lambda x_j.M_j\}(N_j)$ of A where the set $\{\lambda x_j.M_j\}(N_j)$ may be vacuous. To handle
this we shall agree that if the set $\{\lambda x_j.M_j\}(N_j)$ is vacuous, the sequence of
contractions shall be a vacuous sequence of reductions.


LEMMA 1. If $\{\lambda x_j.M_j\}(N_j)$ are parts of A, then a number m can be found
such that any sequence of contractions on the $\{\lambda x_j.M_j\}(N_j)$ will terminate after
at most m contractions, and if A’ and A’’ are two results of terminating sequences
of contractions on the $\{\lambda x_j.M_j\}(N_j)$, then A’ conv-I A’’.


Proof by induction on the number of proper symbols of A. 

The lemma is true if A is a proper symbol, the number m being 0. 

Assume the lemma true for formulas with n or less proper symbols. Let A
have n+1 proper symbols.

Case 1. A is $\lambda x.M$. Then all the parts $\{\lambda x_j.M_j\}(N_j)$ of A must be parts
of M. However M has only n proper symbols and so we use the hypothesis
of the induction.

Case 2. A is $\{F\}(X)$.

(a) $\{F\}(X)$ is not one of the $\{\lambda x_j.M_j\}(N_j)$. Then any sequence of
contractions on the parts $\{\lambda x_j.M_j\}(N_j)$ of A can be replaced* by two sequences
of contractions performed successively, one on those of the $\{\lambda x_j.M_j\}(N_j)$
which are parts of F and one on those of the $\{\lambda x_j.M_j\}(N_j)$ which are parts
of X, in such a way that if the original sequence of contractions carried
$\{F\}(X)$ into $\{F'\}(X')$ then the two sequences of contractions by which it
is replaced carry F into F’ and X into X’ respectively and the total number
of contractions on residuals of parts of F or on residuals of parts of X is the
same as before.† Then we use the hypothesis of the induction, since F and X
each have n or less proper symbols.

(b) $\{F\}(X)$ is one of the $\{\lambda x_j.M_j\}(N_j)$, say $\{\lambda x_p.M_p\}(N_p)$. As long as
the residual of $\{\lambda x_p.M_p\}(N_p)$ has not been contracted, the argument of (a)
applies. Hence if any sequence of contractions on the $\{\lambda x_j.M_j\}(N_j)$ of A
is continued long enough, the residual of $\{\lambda x_p.M_p\}(N_p)$ must be contracted
(since we prove readily that one and only one residual of $\{\lambda x_p.M_p\}(N_p)$
occurs, until a contraction on the residual of $\{\lambda x_p.M_p\}(N_p)$). Let a sequence
of contractions, $\mu$, on the $\{\lambda x_j.M_j\}(N_j)$ consist of a sequence, $\phi$, of
contractions on the $\{\lambda x_j.M_j\}(N_j)$ which are different from $\{\lambda x_p.M_p\}(N_p)$, a
contraction, $\beta$, on the residual of $\{\lambda x_p.M_p\}(N_p)$, and a sequence, $\theta$, consisting
of the remaining contractions of $\mu$. Now, as in (a), we can replace $\phi$ by a
sequence, $\alpha$, of contractions on the $\{\lambda x_j.M_j\}(N_j)$ which are parts of $\lambda x_p.M_p$
(and therefore parts of $M_p$), followed by a sequence, $\eta$, of contractions on the
$\{\lambda x_j.M_j\}(N_j)$ which are parts of $N_p$, and this without changing the total
number of contractions on residuals of parts of $M_p$ or on residuals of parts of
$N_p$. Then $\eta$ followed by $\beta$ can be replaced by $\beta'$, a contraction on the residual,
$\{\lambda y.P\}(N_p)$, of $\{\lambda x_p.M_p\}(N_p)$, followed by a set of applications of $\eta$ on
each of the occurrences of $N_p$ in $S^y_{N_p} P|$ that arose by substituting $N_p$ for y in
P. Our sequence of contractions now has a special form, namely a sequence of
contractions, $\alpha$, on parts of $M_p$, followed by a contraction, $\beta'$, on the residual
of $\{\lambda x_p.M_p\}(N_p)$, followed by other contractions. We will now indicate a
process whereby this sequence of contractions can be replaced by another
having the same special form but having the property that after the
contraction on the residual of $\{\lambda x_p.M_p\}(N_p)$ one less contraction on the
residuals of parts of $M_p$ occurs. This process can then be successively applied


* We say that a sequence, $\mu$, of reductions on A can be replaced by a sequence, $\nu$, of reductions
on A, if both $\mu$ and $\nu$ give the same end formula B and residuals in B of any part $\{\lambda x_j.M_j\}(N_j)$ are
the same under both $\mu$ and $\nu$.

† The reader will easily understand the convention which we use when we say that the same
sequence of contractions which carries F into F’ will carry $\{F\}(X)$ into $\{F'\}(X)$ and that the same
sequence of contractions which carries X into X’ will carry $\{F'\}(X)$ into $\{F'\}(X')$. This convention
will also be used in (b).
until no contraction on a residual of a part of $M_p$ occurs after the
contraction on the residual of $\{\lambda x_p.M_p\}(N_p)$. Moreover this process increases by
one the number of contractions which precede the contraction on the residual
of $\{\lambda x_p.M_p\}(N_p)$, so that the total number of contractions on residuals of
parts of $M_p$ is the same after the process as before, and hence the same as in $\mu$.

Let us consider that sequence of reductions $\zeta$ which is composed of $\beta'$
and the contractions that follow it, up to and including the first contraction
on a residual of a part of $M_p$. Denoting the formula on which $\zeta$ acts by
$\{\lambda y.P\}(N_p)$, we see that $\zeta$ can be considered as the act of first replacing the
free y’s of P by various formulas, $N_{pk}$, got from $N_p$ by sets of reductions, and
then contracting on a residual, $\{\lambda z.R\}(S)$, of one of the $\{\lambda x_j.M_j\}(N_j)$ which
are parts of $M_p$, say $\{\lambda x_q.M_q\}(N_q)$. From this point of view, we see that
none of the free z’s of R are parts of any $N_{pk}$, and hence $\zeta$ can be replaced by
a contraction on the residual in $\lambda y.P$ of $\{\lambda x_q.M_q\}(N_q)$ of which $\{\lambda z.R\}(S)$
is a residual, followed by a contraction on the residual of $\{\lambda x_p.M_p\}(N_p)$,
followed by contractions on residuals of parts of $N_p$. This completes the
indication of what the process is.

Hence $\mu$ can be replaced by a sequence of contractions, $\alpha$, on the
$\{\lambda x_j.M_j\}(N_j)$ which are parts of $M_p$ (such that $\alpha$ contains as many
contractions as there are contractions in $\mu$ on residuals of parts of $M_p$), followed by
a contraction, $\beta$, on the residual of $\{\lambda x_p.M_p\}(N_p)$, followed by a sequence of
contractions, $\gamma$, on residuals of parts of $N_p$. Moreover, after $\alpha$ and $\beta$,
$\{\lambda x_p.M_p\}(N_p)$ has become a formula containing several occurrences of $N_p$
and $\gamma$ is a sequence of contractions on parts of these occurrences of $N_p$.
Hence, since $\lambda x_p.M_p$ and $N_p$ each contain n or less proper symbols, we can
use the hypothesis of the induction in connection with $\alpha$ and $\gamma$ to show that
if $\alpha$ followed by $\beta$ followed by $\gamma$ terminates the result is unique to within
applications of I. But if $\mu$ terminates, so does $\alpha$ followed by $\beta$ followed by $\gamma$.
Hence the results of any two terminating sequences are unique to within
applications of I.

It remains to be shown that a number m can be found such that each se- 
quence of contractions terminates after at most m contractions. 

By the hypothesis of induction, a number a can be found such that any
sequence of contractions on those $\{\lambda x_j.M_j\}(N_j)$ which are parts of $M_p$
terminates after at most a contractions, and a number b can be found such
that any sequence of contractions on those $\{\lambda x_j.M_j\}(N_j)$ which are parts of
$N_p$ terminates after at most b contractions. Then any sequence of
contractions on the $\{\lambda x_j.M_j\}(N_j)$ of A must, if continued, include a contraction
on the residual of $\{\lambda x_p.M_p\}(N_p)$, after at most a+b+1 contractions.
Hence we may confine our attention to sequences of contractions $\mu$ on
the $\{\lambda x_j.M_j\}(N_j)$ of A which include a contraction on the residual of
$\{\lambda x_p.M_p\}(N_p)$ and which can therefore be replaced by a sequence of
contractions of the form, $\alpha$ followed by $\beta$ followed by $\gamma$. Moreover it will be
clear, on examination of our preceding argument, that, when the sequence $\mu$
is replaced by the sequence, $\alpha$ followed by $\beta$ followed by $\gamma$, the total number
of contractions in the sequence is either increased or left unchanged (because
each step in the process of transforming $\mu$ into $\alpha$ followed by $\beta$ followed by
$\gamma$ has the property that it cannot decrease the total number of contractions).
Therefore it is sufficient to find a number m such that any sequence of
contractions on the $\{\lambda x_j.M_j\}(N_j)$ of A which has the form, $\alpha$ followed by $\beta$ followed
by $\gamma$, must terminate after at most m contractions.

If we start with the formula $M_p$ and perform a terminating sequence of
contractions on those $\{\lambda x_j.M_j\}(N_j)$ which are parts of $M_p$, the result is a
formula $M_p'$, which is unique to within applications of I, and which contains
a certain number, $c \geq 1$, of occurrences of $x_p$ as a free symbol. And in the
case of any sequence of contractions on those $\{\lambda x_j.M_j\}(N_j)$ which are parts
of $M_p$, whether terminating or not, the result (that is, the formula into which
$M_p$ is transformed) contains at most c occurrences of $x_p$ as a free symbol.

Hence the required number m is a+1+cb.


LEMMA 2. If A imr B by a contraction on the part $\{\lambda x.M\}(N)$ of A, and
A is $A_1$, and $A_1$ imr $A_2$, $A_2$ imr $A_3$, $\cdots$, and, for all k, $B_k$ is the result of a
terminating sequence of contractions on the residuals in $A_k$ of $\{\lambda x.M\}(N)$, then:
I. $B_1$ is B.
II. For all k, $B_k$ conv-I-II $B_{k+1}$.
III. Even if the sequence $A_1, A_2, \cdots$ can be continued to infinity, there is
a number $\phi_m$, depending on the formula A, the part $\{\lambda x.M\}(N)$ of A, and the
number m, such that, starting with $B_m$, at most $\phi_m$ consecutive $B_k$’s occur for which
it is not true that $B_k$ red $B_{k+1}$.


Part I is obvious. 

We prove Part II readily as follows. Let $\{\lambda y_i.P_i\}(Q_i)$ be the residuals in
$A_k$ of $\{\lambda x.M\}(N)$ and let $A_k$ imr $A_{k+1}$ be a contraction on the part
$\{\lambda z.R\}(S)$ of $A_k$. Then $B_{k+1}$ is the result of a terminating sequence of
contractions on $\{\lambda z.R\}(S)$ and the parts $\{\lambda y_i.P_i\}(Q_i)$ of $A_k$. Now if $\{\lambda z.R\}(S)$
is one of the $\{\lambda y_i.P_i\}(Q_i)$, then no residuals of $\{\lambda z.R\}(S)$ occur in $B_k$, and
$B_k$ conv-I $B_{k+1}$. If however $\{\lambda z.R\}(S)$ is not one of the $\{\lambda y_i.P_i\}(Q_i)$, then
a set of residuals of $\{\lambda z.R\}(S)$ does occur in $B_k$ and a terminating sequence
of contractions on these residuals in $B_k$ gives $B_{k+1}$ by Lemma 1.

In order to prove Part III we note that $B_k$ red $B_{k+1}$ unless the reduction
from $A_k$ to $A_{k+1}$ consists of a contraction on a residual in $A_k$ of $\{\lambda x.M\}(N)$
(see preceding paragraph). But if we start with any particular $A_k$ this can
only be the case a finite number of successive times, by Lemma 1. Hence we
define $\phi_m$ as follows. Perform m successive reductions on A in all possible
ways. This gives a finite set of formulas (except for applications of I). In
each formula find the largest number of reductions that can occur in a
terminating sequence of contractions on the residuals of $\{\lambda x.M\}(N)$. Then let $\phi_m$
be the largest of these.


THEOREM 1. If A conv B, there is a conversion from A to B in which no 
expansion precedes any reduction. 


That is, any conversion can be replaced by a conversion which is a valley. 
This follows from Lemma 2 by the process already indicated.


COROLLARY 1. If B is a normal form* of A, then A conv-I-II B.


For no reductions are possible on a normal form. 


COROLLARY 2. If A has a normal form, its normal form is unique (to within
applications of Rule I).


For if B and B’ are both normal forms of A, then B’ is a normal form of B. 
Hence B conv-I-II B’. Hence B conv-I B’, since no reductions are possible 
on the normal form B. 
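Corollary 2 can be illustrated by a minimal sketch in Python. The encoding used here (de Bruijn indices, under which formulas differing only by applications of Rule I are literally equal) and the names `shift`, `subst`, `contract`, `step`, `normalize`, and `church` are assumptions made for this illustration and do not come from the paper. Two different orders of reduction are applied to the same formula, and the two normal forms reached are compared.

```python
# Formulas: ("var", k), ("lam", body), ("app", function, argument).

def shift(t, d, cutoff=0):
    """Add d to every variable of t whose index is >= cutoff."""
    if t[0] == "var":
        return ("var", t[1] + d) if t[1] >= cutoff else t
    if t[0] == "lam":
        return ("lam", shift(t[1], d, cutoff + 1))
    return ("app", shift(t[1], d, cutoff), shift(t[2], d, cutoff))

def subst(t, j, s):
    """Replace variable j of t by the formula s."""
    if t[0] == "var":
        return s if t[1] == j else t
    if t[0] == "lam":
        return ("lam", subst(t[1], j + 1, shift(s, 1)))
    return ("app", subst(t[1], j, s), subst(t[2], j, s))

def is_redex(t):
    """True if t has the form {lam body}(arg)."""
    return t[0] == "app" and t[1][0] == "lam"

def contract(t):
    """Contract t = {lam body}(arg): substitute arg for the bound variable."""
    body, arg = t[1][1], t[2]
    return shift(subst(body, 0, shift(arg, 1)), -1)

def step(t, leftmost=True):
    """Perform one reduction on t; return None if t is in normal form."""
    if t[0] == "var":
        return None
    if t[0] == "lam":
        body = step(t[1], leftmost)
        return None if body is None else ("lam", body)
    if leftmost and is_redex(t):
        return contract(t)          # outermost part first
    for i in ((1, 2) if leftmost else (2, 1)):
        r = step(t[i], leftmost)
        if r is not None:
            return ("app", r, t[2]) if i == 1 else ("app", t[1], r)
    return contract(t) if is_redex(t) else None

def normalize(t, leftmost=True, limit=1000):
    """Reduce until no reduction is possible; None if limit is exceeded."""
    for _ in range(limit):
        nxt = step(t, leftmost)
        if nxt is None:
            return t
        t = nxt
    return None

def church(n):
    """lam f. lam x. f(f(...f(x)...)) with n occurrences of f."""
    body = ("var", 0)
    for _ in range(n):
        body = ("app", ("var", 1), body)
    return ("lam", ("lam", body))

# {lam n. n(n)}(2) reduces to 2(2), whose normal form is the numeral 4.
term = ("app", ("lam", ("app", ("var", 0), ("var", 0))), church(2))
outermost_first = normalize(term, leftmost=True)
innermost_first = normalize(term, leftmost=False)
print(outermost_first == innermost_first == church(4))
```

The two strategies generally perform different intermediate reductions; only the final normal form is compared.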

Note that only parts I and II of Lemma 2 are needed for Theorem 1 and 
its corollaries. 


THEOREM 2. If B is a normal form of A, then there is a number m such
that any sequence of reductions starting from A will lead to B (to within
applications of Rule I) after at most m reductions.


We prove by induction on n that, if a formula B is a normal form of some
formula A, and there is a sequence of n reductions leading from A to B, then
there is a number $\psi_{A,n}$, depending on the formula A and the number n, such
that any sequence of reductions starting from A will lead to a normal form of A
(which will be B to within applications of I by Theorem 1, Corollary 2) in
at most $\psi_{A,n}$ reductions.

If n=0, we take $\psi_{A,0}$ to be 0.

Assume our statement for n=k. Let A imr C, C imr $C_1$, $C_1$ imr $C_2$, $\cdots$,
$C_{k-1}$ imr B. Let A be the same as $A_1$, $A_1$ imr $A_2$, $A_2$ imr $A_3$, $\cdots$. By Lemma 2
there is a sequence ($D_1$ the same as C) $D_1$ conv-I-II $D_2$, $D_2$ conv-I-II $D_3$, $\cdots$,
such that $A_j$ conv-I-II $D_j$ for all j’s for which $A_j$ exists, and also, if the
reduction from A to $D_1$ (or C) is a contraction on $\{\lambda x.M\}(N)$, such that, starting
with $D_m$, at most $\phi_m$ consecutive $D_j$’s occur for which it is not true that


* Kleene §5, p. 535. 
$D_j$ red $D_{j+1}$. The sequence C imr $C_1$, $C_1$ imr $C_2$, $\cdots$ leads to B in k reductions
and so by the hypothesis of the induction there is a number $\psi_{C,k}$ such that
any sequence of reductions on C leads to a normal form (i.e., terminates)
after at most $\psi_{C,k}$ reductions. Hence there are at most $\psi_{C,k}$ reductions in the
sequence $D_1$ conv-I-II $D_2$, $D_2$ conv-I-II $D_3$, $\cdots$, and hence this terminates
after at most $f(\psi_{C,k})$ steps where f(p) is defined as follows:


$f(0) = \phi_1,$
$f(p+1) = f(p) + M + 1,$


where M is the greatest of the numbers $\phi_1, \phi_2, \cdots, \phi_{f(p)+1}$.*

Then, since the sequence of $D_j$’s continues as long as there are $A_j$’s on
which reductions can be performed, it follows that after at most $f(\psi_{C,k})$
reductions we come to an $A_j$ on which no reductions are possible. But this is
equivalent to saying that this $A_j$ is a normal form. Hence any sequence of
k+1 reductions from A to B determines an upper bound which holds for all
sequences of reductions starting from A. We then take all sequences of k+1
reductions starting from A (this is a finite set of sequences, since we reckon
two sequences as the same if they differ only by applications of I) and find
the upper bounds determined by each one of them that leads to a normal
form. Then we define $\psi_{A,k+1}$ to be the least of these upper bounds.


This completes our induction. But, by Theorem 1, Corollary 1, if B is a
normal form of A, there is a sequence of some finite number of reductions 
leading from A to B. Hence Theorem 2 follows. 


COROLLARY. If a formula has a normal form, every well-formed part of it
has a normal form. 


2. Other kinds of conversion. There are also other systems of operations 
on formulas, similar to the system which we have been discussing and in 
which there can be distinguished reductions and expansions and possibly neu- 
tral operations (such as applications of Rule I). For convenience, we speak 
of the operations of any such system as conversions, and we define a normal 
form to be a formula on which no reductions are possible. 

A kind of conversion which appears to be useful in certain connections is
obtained by taking a new undefined term $\delta$ (restricting ourselves by never
using $\delta$ as a bound symbol) and adding to Church’s Rules I, II, III the
following rules:


IV. Suppose that M and N contain no free symbols other than $\delta$, that there


* f(p) depends, of course, on the formula A and the part $\{\lambda x.M\}(N)$ of A, as well as on p,
because $\phi_m$ depends on A and $\{\lambda x.M\}(N)$.




is no part $\{\lambda z.P\}(Q)$ of either M or N, and that there is no part $\delta(R,S)$ of either
M or N in which R and S contain no free symbols other than $\delta$. Then we may
pass from a formula J to a formula K obtained from J by substituting for a
particular occurrence of $\delta(M,N)$ in J either $\lambda fx.f(f(x))$ or $\lambda fx.f(x)$ according to
whether it is or is not true that M conv-I N.


V. The inverse operation of that described in Rule IV is allowable. That is, 
we may pass from K to J under the same circumstances. 


We call an application of Rule II together with applications of Rule I,
or an application of Rule IV together with applications of Rule I, a reduction,
and the reverse operations (involving Rule III or Rule V) expansions. We
call any sequence of applications of various ones of the five rules a conversion.
Also we say that we contract $\{\lambda x.M\}(N)$ if we replace it by $S^x_N M|$, and that
we contract $\delta(M,N)$ if we replace it by $\lambda fx.f(f(x))$ or $\lambda fx.f(x)$ in accordance
with Rule IV.

We define the residuals of $\{\lambda x.M\}(N)$ after an application of I or II in
the same way as before, and after an application of IV as what $\{\lambda x.M\}(N)$
becomes (the restrictions in IV ensure that it becomes something of the form
$\{\lambda y.P\}(Q)$). The residuals of $\delta(M,N)$ after an application of I, II, or IV
are defined only in the case that M and N are in normal form and contain no
free symbols other than $\delta$. In that case the residuals of $\delta(M,N)$ are whatever
part or parts of the entire resulting formula $\delta(M,N)$ becomes, except that
after an application of IV which is a contraction of $\delta(M,N)$ itself, $\delta(M,N)$
has no residual. Thus residuals of $\delta(M,N)$ are always of the form $\delta(P,Q)$,
where P and Q are in normal form and contain no free symbols other than $\delta$.
We define a sequence of contractions on the parts $\{\lambda x_j.M_j\}(N_j)$ and
$\delta(P_i,Q_i)$ of A, where $P_i$ and $Q_i$ are in normal form and contain no free
symbols other than $\delta$, by analogy with our former definition. Similarly for a
terminating sequence of such contractions. Then we prove Lemma 1 by an
obvious extension of our former argument. Lemma 2 and Theorems 1 and 2
then follow as before. Of course we replace “conv-I-II” by “conv-I-II-IV”
in Lemma 2 and in general throughout the proofs of Theorems 1 and 2. In
Lemma 1 we allow that the set of parts of A on which a sequence of
contractions is taken should include not only parts of the form $\{\lambda x_j.M_j\}(N_j)$ but
also parts of the form $\delta(P_i,Q_i)$ in which $P_i$ and $Q_i$ are in normal form and
contain no free symbols other than $\delta$. And in Lemma 2 we consider also the
case that A imr B by a contraction on the part $\delta(P,Q)$ of A.

We may also consider a third kind of conversion, namely the conversion
that results if we modify Kleene’s definitions of well-formed, free, and bound
by omitting the requirement that x be a free symbol of R from (3) of the
definition (see Kleene, top of p. 530) and then modify Church’s rules I, II, III
by using the new meanings of well-formed, free and bound and by omitting “if
the proper symbol x occurs in M” from both II and III.* Then we call an
application of the modified II together with applications of the modified I
a reduction, and the reverse operation an expansion, and applications of all
three rules conversion. Also we say that we contract $\{\lambda x.M\}(N)$ if we replace
it by $S^x_N M|$. If A has the form $\{\lambda x.M\}(N_1, N_2, \cdots, N_r)$, then $\{\lambda x.M\}(N_1)$
is said to be of order one in A and a contraction of $\{\lambda x.M\}(N_1)$ is said to be
a reduction of order one of A. Then Lemma 1 and parts I and II of Lemma
2 hold (although some modification is required in the proofs). Hence Theorem
1 and its corollaries hold. But Theorem 2 is false. Instead a weaker form of
Theorem 2 can be proved, namely:


THEOREM 3. If A has a normal form, then there is a number m such that at
most m reductions of order one can occur in a sequence of reductions on A.
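The failure of Theorem 2 for this third kind of conversion can be seen on a small example, sketched below in Python. The formula chosen, $\{\lambda x.\lambda y.y\}(\Omega)$ with $\Omega = \{\lambda w.w(w)\}(\lambda w.w(w))$, is an illustration supplied here rather than one given in the paper; it is well-formed only under the modified definitions, since x does not occur in $\lambda y.y$. The de Bruijn encoding and all function names are likewise assumptions of this sketch. Contracting the part of order one reaches the normal form in a single reduction, whereas a sequence that always contracts inside the argument never terminates.

```python
# Formulas: ("var", k), ("lam", body), ("app", function, argument).

def shift(t, d, cutoff=0):
    """Add d to every variable of t whose index is >= cutoff."""
    if t[0] == "var":
        return ("var", t[1] + d) if t[1] >= cutoff else t
    if t[0] == "lam":
        return ("lam", shift(t[1], d, cutoff + 1))
    return ("app", shift(t[1], d, cutoff), shift(t[2], d, cutoff))

def subst(t, j, s):
    """Replace variable j of t by the formula s."""
    if t[0] == "var":
        return s if t[1] == j else t
    if t[0] == "lam":
        return ("lam", subst(t[1], j + 1, shift(s, 1)))
    return ("app", subst(t[1], j, s), subst(t[2], j, s))

def is_redex(t):
    return t[0] == "app" and t[1][0] == "lam"

def contract(t):
    """Contract t = {lam body}(arg)."""
    body, arg = t[1][1], t[2]
    return shift(subst(body, 0, shift(arg, 1)), -1)

def step(t, leftmost=True):
    """Perform one reduction on t; return None if t is in normal form."""
    if t[0] == "var":
        return None
    if t[0] == "lam":
        body = step(t[1], leftmost)
        return None if body is None else ("lam", body)
    if leftmost and is_redex(t):
        return contract(t)
    for i in ((1, 2) if leftmost else (2, 1)):
        r = step(t[i], leftmost)
        if r is not None:
            return ("app", r, t[2]) if i == 1 else ("app", t[1], r)
    return contract(t) if is_redex(t) else None

def reduce_at_most(t, leftmost, limit):
    """Return (formula, reductions performed, whether a normal form was reached)."""
    for n in range(limit):
        nxt = step(t, leftmost)
        if nxt is None:
            return t, n, True
        t = nxt
    return t, limit, False

self_app = ("lam", ("app", ("var", 0), ("var", 0)))   # lam w. w(w)
omega = ("app", self_app, self_app)                   # reduces only to itself
const_id = ("lam", ("lam", ("var", 0)))               # lam x. lam y. y
term = ("app", const_id, omega)

print(reduce_at_most(term, leftmost=True, limit=50))
print(reduce_at_most(term, leftmost=False, limit=50)[1:])
```

The step limit of 50 is arbitrary; any finite limit is exhausted by the second sequence, which is consistent with Theorem 3 because none of its reductions is of order one.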


* The first kind of conversion which we have considered is essentially equivalent to a certain 
portion of the combinatory axioms and rules of H. B. Curry (American Journal of Mathematics, 
vol. 52 (1930), pp. 509-536, 789-834), as has been proved by J. B. Rosser (see Annals of Mathe- 
matics, (2), vol. 36 (1935), p. 127). In terms of Curry’s notation, our third kind of conversion can be 
thought of as differing from the first kind by the addition of the constancy function K. 

In dealing with properties of conversion, use of the Schönfinkel-Curry combinatory analysis
appears in certain connections to be an important, even indispensable, device. But to recast the
present discussion and results entirely into a combinatory notation would, it is thought, be awkward
or impossible, because of the difficulty in finding a satisfactory equivalent, for combinations, of the
notions of reduction and normal form as employed in this paper.
ON k-COMMUTATIVE MATRICES*
INTRODUCTION


DEFINITION 1. If A and B are two $n \times n$ matrices, then the matrix


(1) $B_k = A^kB - \binom{k}{1}A^{k-1}BA + \binom{k}{2}A^{k-2}BA^2 - \cdots + (-1)^kBA^k$


is the kth commute of A with respect to B.
Evidently if we designate B by $B_0$, we have in general
(2) $B_{i+1} = AB_i - B_iA$ $(i = 0, 1, 2, \cdots)$.


The matrices $B_i$, defined by these relations, have significance in the study
of the Lie groups of infinitesimal rotations and have been studied by numerous
writers. Particular attention is invited to the references I–XVII.† In the
present paper we shall study the commutes of a pair of matrices as a part
of matric algebra and shall not attempt to interpret the significance the
results may have in modern physical theories.
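As a numerical check of Definition 1, the kth commute can be computed either from the recursion $B_{i+1} = AB_i - B_iA$ or from the alternating binomial expansion, and the two should agree. The following is a minimal sketch in Python with plain integer matrices stored as lists of rows; the function names and the sample matrices are assumptions introduced for this illustration.

```python
from math import comb

def matmul(X, Y):
    """Ordinary matrix product of two lists of rows."""
    return [[sum(X[i][t] * Y[t][j] for t in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def lincomb(c, X, d, Y):
    """Return c*X + d*Y entrywise."""
    return [[c * x + d * y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def identity(n):
    return [[int(i == j) for j in range(n)] for i in range(n)]

def matpow(X, k):
    """X raised to the non-negative integer power k."""
    P = identity(len(X))
    for _ in range(k):
        P = matmul(P, X)
    return P

def commute_recursive(A, B, k):
    """kth commute of A with respect to B via B_{i+1} = A B_i - B_i A."""
    Bi = B
    for _ in range(k):
        Bi = lincomb(1, matmul(A, Bi), -1, matmul(Bi, A))
    return Bi

def commute_binomial(A, B, k):
    """kth commute as the sum of (-1)^j C(k, j) A^(k-j) B A^j for j = 0..k."""
    n = len(A)
    total = [[0] * n for _ in range(n)]
    for j in range(k + 1):
        term = matmul(matmul(matpow(A, k - j), B), matpow(A, j))
        total = lincomb(1, total, (-1) ** j * comb(k, j), term)
    return total

A = [[1, 2], [0, 3]]
B = [[0, 1], [4, 5]]
for k in range(5):
    print(k, commute_recursive(A, B, k) == commute_binomial(A, B, k))
```

Integer entries are used so that the comparison is exact; with floating-point entries a tolerance would be needed.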


DEFINITION 2. The matrix A is k-commutative with respect to B, where A 
and B are nXn matrices, if the kth commute of A with respect to B is zero, 
whereas no commute of A with respect to B of index less than k is zero. 


DEFINITION 3. The matrices A and B of order n are mutually k-commuta- 
tive, if say A is k-commutative with respect to B and if B is at most k-commuta- 
tive with respect to A. 


If A and B are commutative in the usual sense, then they are mutually 
one-commutative. The quasi-commutative matrices defined by McCoy (XV) 
are mutually two-commutative in the sense defined above. 

In §1, we study general properties of the kth commutes of A with respect 
to B, with and without the restriction that A be k-commutative with respect 
to B. In §2, we study more particularly the structure of B, where A is as- 
sumed to be in the Jordan canonical form and is k-commutative with respect 
to B. The solution of the equation 


(3) $AX - XA = \mu X$


* Presented to the Society, December 28, 1934; received by the editors April 15, 1935. 
† Roman numerals will refer to the references listed at the end of this paper.
is taken up in the third section as a special case under the more general
equation
(4) X, = 


where $X_k$ is the kth commute of A with respect to X, and $\mu$ is a scalar
constant. The equation (3) was studied by Killing (I) and Weinstein (IX), and
equation (4) by Weyl (IV–VIII), and others (X–XV). Finally the results in
the preceding sections are applied in the investigation of sets of
anticommutating (XVI) and of semi-commutative matrices (XVII).


1. GENERAL RESULTS ON k-COMMUTATIVE MATRICES 


For convenience in deriving results below, we shall employ a procedure
given in some detail in an earlier paper by the writer (XXI); a brief résumé
thereof will now be given. Let $M = (m_{ij})$ be an $n \times p$ matrix; then $M^R$ is the
$1 \times np$ matrix obtained from M by placing its second row on the right of the
first, its third on the right of the second, and so on. If N is a $q \times r$ matrix,
then $M(N) = (m_{ij}N)$ is an $nq \times pr$ matrix, namely, the direct product of M
and N. The transpose of M will be designated by $M^T$. Throughout, matrices
will be designated by capital letters, and scalar quantities by lower case
letters, save that R and T used as exponents indicate the transformations of
matrices noted above.

In accordance with these conventions equation (2) is equivalent to the
unilateral equation


(5) $B_{i+1}^R = B_i^R[A]$,


where $[A]$ is the $n^2 \times n^2$ matrix $A^T(I) - I(A)$. The transformation of equation
(2) to (5) is reversible. Equation (1) now takes the simple form


(6) $B_k^R = B^R[A]^k$.

If A is k-commutative with respect to B, we have, according to Definition 2,

(7) $B^R[A]^k = 0, \quad B^R[A]^h \neq 0, \quad h < k$.

The two following theorems are obvious results of definitions: 


THEOREM 1. If A is k-commutative with respect to B, then all commutes, $B_i$,
of A with respect to B are zero for $i \geq k$.

THEOREM 2. If A is k-commutative with respect to B and to C, then A is at 
most k-commutative with respect to bB+cC, where b and c are scalar multipliers. 

We shall now prove 


THEOREM 3. If A is k-commutative with respect to B, then every scalar
polynomial in A is at most k-commutative with respect to B.
Let the scalar polynomial in A be

$f(A) = a_0I + a_1A + a_2A^2 + \cdots + a_tA^t.$

The proof of this theorem consists in showing that

(8) $B^R[f(A)]^k = 0$,

if (7) is satisfied. Obviously the transpose of f(A) is $f(A^T)$ and


$[f(A)] = f(A^T)(I) - I(f(A)) = \sum_{i=1}^{t} a_i[A^i].$


However

$[A^i] = [A]\{(A^T)^{i-1}(I) + (A^T)^{i-2}(A) + \cdots + I(A^{i-1})\}$


and we may therefore write


$[f(A)] = [A]Q,$


where Q is an $n^2 \times n^2$ matrix and is commutative with $[A]$. Hence


$[f(A)]^k = [A]^kQ^k.$

Multiply this equation on the left by $B^R$ and (8) follows because of (7).


THEOREM 4. If A is k-commutative with respect to $B_0$ and if $B_i$ is the ith
commute of A with respect to $B_0$, then the commutes $B_i$ $(i = 0, 1, 2, \cdots, k-1)$
are linearly independent matrices.


Suppose that scalar constants $a_i$ $(i = 0, 1, \cdots, k-1)$, not all zero, exist
such that

$a_0B_0 + a_1B_1 + \cdots + a_{k-1}B_{k-1} = 0;$

then, according to (6), we have

(9) $B^R\{a_0I(I) + a_1[A] + \cdots + a_{k-1}[A]^{k-1}\} = 0.$

Multiply the latter on the right by $[A]^{k-1}$ and, according to (7),

$a_0B^R[A]^{k-1} = a_0B_{k-1}^R = 0.$

However, by definition of k-commutative matrices, $B_{k-1} \neq 0$, hence $a_0 = 0$.
Similarly, if (9) be multiplied on the right by $[A]^{k-2}$, we find that $a_1$ must also
be zero, and so on. This leads to a contradiction of the assumption that not
all $a_i$ are zeros; the theorem is therefore proved.


THEOREM 5. If A is k-commutative with respect to B and if the degree of no
elementary divisor of $A - \lambda I$ exceeds $\alpha$, then $k \leq 2\alpha - 1$.
The matrix $[A] - \lambda I(I)$ has at least one elementary divisor $\lambda^{2\alpha-1}$ and none
of higher degree (XXI, Theorem 2), if that of highest degree of $A - \lambda I$ is
$(a - \lambda)^{\alpha}$. Therefore $[A]$ satisfies the minimal equation


$a_0[A]^{2\alpha-1} + a_1[A]^{2\alpha} + \cdots + a_{g-2\alpha+1}[A]^g = 0, \quad g \geq 2\alpha - 1,$


where $a_0$ is not zero. Multiply this equation on the left by $B^R$, and by (6) we
conclude that


$a_0B_{2\alpha-1} + a_1B_{2\alpha} + \cdots + a_{h-2\alpha+1}B_h = 0,$


where h is the lesser of the two integers g and k−1. If k exceeds $2\alpha - 1$, this
linear dependence between the commutes $B_i$ $(i = 2\alpha-1, 2\alpha, \cdots, h)$ of A
with respect to B cannot hold because of Theorem 4. Hence $k \leq 2\alpha - 1$.


COROLLARY 1. If $A - \lambda I$ has no elementary divisor whose degree exceeds
$\alpha$ and if $B_h$, the hth commute of A with respect to B, is not zero for
$h > 2\alpha - 1$, then A is k-commutative with respect to B for no finite value of k.


This corollary follows at once from the theorem above. We may remark 
that A is k-commutative with respect to no non-zero X satisfying equation 
(3), but on the other hand every such solution is two-commutative with re- 
spect to A and non-zero solutions of this equation may exist; we therefore 
can conclude that matrices B, such that A is k-commutative with respect to 
B for no finite k, do exist. 


COROLLARY 2. There exist no matrices A and B of order less than (k+1)/2
such that A is k-commutative with respect to B.


The degree of the elementary divisor of highest degree of $A - \lambda I$ cannot
exceed n. Hence by the theorem above, $k \le 2n - 1$ in order that A be
k-commutative with respect to B. The corollary is proved. McCoy (XV, p. 335)
gave a more restrictive result than that of the present corollary in case A
and B are mutually two-commutative; namely, that none of second order
exist. However, second-order matrices exist such that A is two-commutative
with respect to B, and B is not two-commutative with respect to A. Example:

\[ A = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \qquad B = \begin{pmatrix} a & b \\ 0 & c \end{pmatrix}, \qquad \text{where } a \neq c. \]
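
The example can be checked mechanically. The following Python sketch assumes the recursion $B_0 = B$, $B_{i+1} = AB_i - B_iA$ for the commutes, which agrees with the relation $X_1 = AX - XA$ used in the text; the helper names and the sample values a = 1, b = 5, c = 2 are illustrative choices, not taken from the paper.

```python
def matmul(X, Y):
    """Product of two square matrices stored as nested lists."""
    n = len(X)
    return [[sum(X[i][t] * Y[t][j] for t in range(n)) for j in range(n)]
            for i in range(n)]


def commute(A, B):
    """First commute of A with respect to B, assumed to be AB - BA."""
    AB, BA = matmul(A, B), matmul(B, A)
    n = len(A)
    return [[AB[i][j] - BA[i][j] for j in range(n)] for i in range(n)]


def commutativity_index(A, B, max_k=12):
    """Smallest k with B_k = 0, or None if no k <= max_k works."""
    current = B  # B_0
    for k in range(max_k + 1):
        if all(entry == 0 for row in current for entry in row):
            return k
        current = commute(A, current)  # B_{k+1}
    return None


A = [[0, 1], [0, 0]]
B = [[1, 5], [0, 2]]   # hypothetical values a = 1, b = 5, c = 2, so a != c
E21 = [[0, 0], [1, 0]]  # extra test matrix, not from the paper

index_A_wrt_B = commutativity_index(A, B)      # A with respect to B
index_B_wrt_A = commutativity_index(B, A)      # B with respect to A
index_A_wrt_E21 = commutativity_index(A, E21)  # reaches the bound 2n - 1
```

With these values the first index is 2, the second search finds no finite index within the search range (the ith commute of B with respect to A keeps the corner entry $\pm(a - c)^i$), and the third index is 3, so the bound $k \le 2n - 1$ of the corollary is attained for n = 2.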


COROLLARY 3. If $|A - \lambda I| = (a - \lambda)^n$ and if the degree of no
elementary divisor of $A - \lambda I$ exceeds $\alpha$, then A is k-commutative with
respect to every matrix, B, of order n, and for any given B, $k \le 2\alpha - 1$.

Weyl (VI, p. 100) originally gave this result. Under the hypotheses of
the present corollary $g = 2\alpha - 1$ and $[A]^{2\alpha-1} = 0$ because
$[A] - \lambda I(I)$ has elementary divisors only of the form $\lambda^i$ where that
of highest degree is $\lambda^{2\alpha-1}$ (XX or XXI, Theorem 2). Hence the
$(2\alpha-1)$st commute of A with respect to B is zero.

An alternative statement of Theorem 5 is given by 


COROLLARY 4. If A is k-commutative with respect to B, and if the minimal
polynomial satisfied by $[A]$ is $\lambda^{\beta}\theta(\lambda)$, where
$\theta(0) \neq 0$, then $k \le \beta$.

Heretofore we have considered the kth commute of A with respect to B 
only for positive values of k; however, in certain cases Definition 1 may have 
sense for negative indices as well. Thus the general solution, if it exists, of 
the equation 

\[ X_1 = AX - XA = B \]
may be regarded as the (-1)st commute of A with respect to B; and the general
solution, X, of the equation
\[ X_i = B, \]
where $X_i$ is the ith commute of A with respect to X, is the (-i)th commute
of A with respect to B. The latter equation is equivalent to
\[ X^R [A]^i = B^R; \]
if X, satisfying this equation, exists, it is not unique in that the number of
linearly independent solutions is $n^2 - r_i$, where $r_i$ is the rank of $[A]^i$. Hence
according to the well known theory of linear non-homogeneous equations the 
following theorem holds: 


THEOREM 6. If A and B are given matrices of order n, then the (-i)th commute
of A with respect to B exists, i > 0, if and only if the matrices
\[ [A]^i \quad \text{and} \quad \begin{pmatrix} [A]^i \\ B^R \end{pmatrix} \]
have the same rank, $r_i$, and the number of linearly independent (-i)th commutes
of A with respect to B, i > 0, is $n^2 - r_i$.


THEOREM 7. If A is k-commutative with respect to B and if $B_i$ is the ith
commute of A with respect to B, then
\[
\begin{aligned}
f(A)B &= Bf(A) + B_1 f'(A) + \frac{1}{2!} B_2 f''(A) + \cdots + \frac{1}{(k-1)!} B_{k-1} f^{(k-1)}(A), \\
Bf(A) &= f(A)B - f'(A)B_1 + \frac{1}{2!} f''(A) B_2 - \cdots + \frac{(-1)^{k-1}}{(k-1)!} f^{(k-1)}(A) B_{k-1},
\end{aligned}
\tag{10}
\]
where $f(\lambda)$ is a scalar polynomial in $\lambda$ and $f^{(i)}(\lambda)$ its ith
derivative with respect to $\lambda$.

Obviously
\[ A^T(I) = [A] + I(A) \]
and
\[ (A^T(I))^r = (A^r)^T(I) = ([A] + I(A))^r; \]
but $I(A)$ and $[A]$ are commutative matrices and the right member above
may therefore be expanded according to the binomial theorem. Multiply the
result on the left by $B^R$; then
\[ B^R (A^r)^T(I) = B^R I(A^r) + \binom{r}{1} B^R [A]\, I(A^{r-1}) + \cdots + B^R [A]^r, \]
or
\[ A^r B = BA^r + \binom{r}{1} B_1 A^{r-1} + \binom{r}{2} B_2 A^{r-2} + \cdots + B_r. \]
This relation is equivalent to that derived by Campbell (III, §2), and from
it the first identity of the theorem above follows at once. Similarly on the
basis of $I(A) = A^T(I) - [A]$, we can readily prove the second also. The theorem
can be generalized to apply for more general functions $f(A)$, and if A is
not assumed to be k-commutative with respect to B the formulas still hold
save that the right members will not stop with the kth term.
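
The expansion of $A^r B$ in terms of the commutes can be tested on integer matrices. The sketch below assumes the convention $B_0 = B$, $B_{i+1} = AB_i - B_iA$; the two sample matrices and all function names are arbitrary illustrations, not taken from the paper. Because no k-commutativity is assumed, the sum runs over all i from 0 to r.

```python
from math import comb


def matmul(X, Y):
    """Product of two square matrices stored as nested lists."""
    n = len(X)
    return [[sum(X[i][t] * Y[t][j] for t in range(n)) for j in range(n)]
            for i in range(n)]


def matpow(X, e):
    """X raised to the non-negative integer power e."""
    n = len(X)
    result = [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(e):
        result = matmul(result, X)
    return result


def commute(A, B):
    """First commute of A with respect to B, assumed to be AB - BA."""
    AB, BA = matmul(A, B), matmul(B, A)
    n = len(A)
    return [[AB[i][j] - BA[i][j] for j in range(n)] for i in range(n)]


def expansion(A, B, r):
    """Sum over i = 0..r of C(r, i) * B_i * A^(r - i), with B_0 = B."""
    n = len(A)
    total = [[0] * n for _ in range(n)]
    B_i = B
    for i in range(r + 1):
        term = matmul(B_i, matpow(A, r - i))
        for p in range(n):
            for q in range(n):
                total[p][q] += comb(r, i) * term[p][q]
        B_i = commute(A, B_i)  # next commute
    return total


A = [[1, 2, 0], [0, 1, 3], [4, 0, 1]]
B = [[2, 0, 1], [1, 1, 0], [0, 5, 3]]
identity_holds = all(
    matmul(matpow(A, r), B) == expansion(A, B, r) for r in range(6)
)
```

Integer arithmetic keeps the comparison exact, so `identity_holds` tests the relation itself rather than a floating-point approximation of it.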

If $A = (a_{ij})$ is an $n \times n$ matrix whose elements $a_{ij}$ (i, j = 1, 2, ..., n) are
differentiable functions of t, we have
\[
\begin{aligned}
\frac{dA^r}{dt} &= \binom{r}{1} A_1 A^{r-1} + \binom{r}{2} A_2 A^{r-2} + \cdots + \binom{r}{k} A_k A^{r-k} \\
 &= \binom{r}{1} A^{r-1} A_1 - \binom{r}{2} A^{r-2} A_2 + \cdots + (-1)^{k-1} \binom{r}{k} A^{r-k} A_k,
\end{aligned}
\]
where $A_1 = (da_{ij}/dt)$, where $A_i$ (i = 2, 3, ..., k) is the (i-1)st commute of A
with respect to its derivative, $A_1$, and where A is k-commutative with respect
to $A_1$. These formulas may readily be established by mathematical induction.
In case A is commutative with its derivative, the right members reduce
to the usual result for scalar quantities.
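
As a check on the signs, the case r = 2 can be worked out directly from the product rule, writing $A_2 = AA_1 - A_1A$ for the first commute of A with respect to $A_1$ (this sign convention is an assumption, chosen to agree with $X_1 = AX - XA$):

```latex
\begin{aligned}
\frac{dA^2}{dt} &= A_1 A + A A_1 \\
 &= 2 A_1 A + (A A_1 - A_1 A) = \binom{2}{1} A_1 A + \binom{2}{2} A_2 \\
 &= 2 A A_1 - (A A_1 - A_1 A) = \binom{2}{1} A A_1 - \binom{2}{2} A_2 .
\end{aligned}
```

Both forms collapse to the scalar rule $2A\,dA/dt$ when $A_2 = 0$, that is, when A commutes with its derivative.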

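If $f(A)$ is a scalar polynomial (or convergent power series) in A, we readily
obtain the following identities:
\[ \frac{d}{dt} f(A) = A_1 f'(A) + \frac{1}{2!} A_2 f''(A) + \cdots + \frac{1}{k!} A_k f^{(k)}(A), \tag{11} \]
or
\[ \frac{d}{dt} f(A) = f'(A) A_1 - \frac{1}{2!} f''(A) A_2 + \cdots + \frac{(-1)^{k-1}}{k!} f^{(k)}(A) A_k, \]
where A is k-commutative with respect to its derivative.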


THEOREM 8. If A is (k+1)-commutative with respect to X and if the first
commute of A with respect to X is equal to the derivative, $A_1 = (da_{ij}/dt)$, of A,
then
\[ \frac{d}{dt} f(A) = f(A)X - Xf(A), \]
where $f(\lambda)$ is a function of $\lambda$ such that $f(A)$ converges for all values of t
in the interval under consideration.

By hypothesis,
\[ A_1 = AX - XA. \]
If in the first formula (11) we add $Xf(A) - Xf(A)$ to the right member and
compare the result with (10) we have the result of the theorem above. The
restrictions that $f(A)$ be a polynomial in A and that A be k-commutative
with respect to its derivative, $A_1$, may be removed provided proper bounds
may be placed upon the elements of A to insure the convergence of $f(A)$.


2. MORE EXPLICIT FORM OF B 


We shall now derive restrictions upon the form of B, where that of A is 
known and where A is k-commutative with respect to B. In the present sec- 
tion and hereafter we shall discontinue the use of subscripts to indicate the 
commutes of a matrix pair unless the contrary is specifically stated. 


THEOREM 9. If
\[ A = A_1 + A_2 + \cdots + A_r,^{*} \]
where the $m_i \times m_i$ matrix $A_i$ (i = 1, 2, ..., r) has a unique characteristic value
$a_i$ and $a_i \neq a_j$, if $i \neq j$, and if A is k-commutative with respect to $B = (B_{ij})$,
where $B_{ij}$ (i, j = 1, 2, ..., r) are $m_i \times m_j$ matrices, then
\[ B = B_{11} + B_{22} + \cdots + B_{rr}, \]
and $A_i$ is at most k-commutative with respect to $B_{ii}$.

* A matrix $M = (M_{ij})$, where $M_{ij}$ are $m_i \times m_j$ matrices and where all
$M_{ij} = 0$, if $i \neq j$, is here and in the following pages denoted by the notation
$M = M_{11} + M_{22} + \cdots + M_{rr}$. A single subscript on the matrices $M_{ii}$ is
sufficient in many cases.


It is no restriction to assume that A has the form given above, for by a
suitable non-singular transformation it can be brought into this form. Since
A is k-commutative with respect to B, the matrix (1) must be zero and we
consequently have the $r^2$ equations
\[ A_i^k B_{ij} - \binom{k}{1} A_i^{k-1} B_{ij} A_j + \cdots + (-1)^k B_{ij} A_j^k = 0 \]
(i, j = 1, 2, ..., r). These equations must be satisfied by the matrices $B_{ij}$
independently. In the unilateral form they become
\[ B_{ij}^R [A_i, A_j]^k = 0 \qquad (i, j = 1, 2, \ldots, r), \tag{12} \]
where
\[ [A_i, A_j] = A_i^T(I_j) - I_i(A_j) \]
and $I_h$ are $m_h \times m_h$ unit matrices. Each of the $r^2$ equations (12) is equivalent
to a system of $m_i m_j$ linear homogeneous equations in the $m_i m_j$ elements of
$B_{ij}$, the matrix of whose coefficients is $[A_i, A_j]^k$. This matrix is singular if
and only if $a_i = a_j$ (XVIII-XXI). Therefore $B_{ij} = 0$, if $i \neq j$, and in case i = j
we see by (12) that $A_i$ is at most k-commutative with respect to $B_{ii}$
(i = 1, 2, ..., r). This well known result concerning matrices which are
commutative in the ordinary sense holds as well for k-commutative matrices. The
following theorem is still more precise in defining the structure of B.


THEOREM 10. If
\[ A = A_1 + A_2 + \cdots + A_s, \]
where $A_i = a_i I_i + D_i$ and $I_i$ and $D_i$ are respectively the unit and the auxiliary
unit matrices* of order $n_i$, and if A is k-commutative with respect to $B = (B_{ij})$,
where $B_{ij}$ are $n_i \times n_j$ (i, j = 1, 2, ..., s) matrices, then $B_{ij} = 0$, if $a_i \neq a_j$, and
if $a_i = a_j$, $B_{ij}$ has zero elements in at least the first $\{n_i, n_j\} - k$ diagonals, where
$\{n_i, n_j\}$ is the greater of the integers $n_i$ and $n_j$.†

* The auxiliary unit matrices $D_i$ of order $n_i$ are here understood to have $n_i - 1$
unit elements in the first diagonal above the principal diagonal and to have zero
elements elsewhere.

† Diagonals are here numbered consecutively beginning with that containing the
lower left element of the blocks $B_{ij}$.


As in the proof of Theorem 9, we have
\[ B_{ij}^R [A_i, A_j]^k = 0 \]
and $B_{ij} = 0$, if $a_i \neq a_j$. However, in case $a_i = a_j$,
\[ [A_i, A_j] = [D_i, D_j]. \]
Hence
\[ B_{ij}^R [D_i, D_j]^k = 0, \tag{13} \]
or
\[ D_i^k B_{ij} - \binom{k}{1} D_i^{k-1} B_{ij} D_j + \cdots + (-1)^k B_{ij} D_j^k = 0. \]
Let
\[ B_{ij} = \begin{pmatrix} C_1 & C_2 \\ C_3 & C_4 \end{pmatrix}, \]
where $C_1, C_2, C_3, C_4$ are respectively $\alpha \times (n_j - \beta)$, $\alpha \times \beta$,
$(n_i - \alpha) \times (n_j - \beta)$, $(n_i - \alpha) \times \beta$ matrices; then
\[ D_i^{\alpha} B_{ij} D_j^{\beta} = \begin{pmatrix} 0 & C_3 \\ 0 & 0 \end{pmatrix}. \]
Because of this fact we can conclude that $B_{ij}$, which satisfies (13), must have
only zero elements in at least the first $\{n_i, n_j\} - k$ diagonals where $\{n_i, n_j\}$
is the greater of the integers $n_i$ and $n_j$. This proves the theorem.

However, in case $n_i = n_j$ and $a_i = a_j$, we can show that the elements in the
$(n_i - k + 1)$st diagonal of $B_{ij}$ are likewise zeros provided k > 1, since in this
case these elements must satisfy linear homogeneous equations with non-zero
determinants. This fact, together with the form of B as demonstrated
above, leads us to the following theorem:


THEOREM 11. If $A - \lambda I$ has the elementary divisors $(a_i - \lambda)^{n_i}$
(i = 1, 2, ..., s), if A is two-commutative with respect to B, and if $n_i \neq n_j + 1$
in case $a_i = a_j$, then the characteristic values of $f(A, B)$, where $f(\lambda, \mu)$ is a
scalar polynomial in $\lambda$ and $\mu$, are in the set $f(a_i, b_h)$ where $b_h$
(h = 1, 2, ..., t) are the distinct characteristic values of B.


Under the hypotheses of this theorem we add no restrictions upon A and
B if we assume that A is in the Jordan canonical form given in Theorem 10.
The matrix B will be an umbral matrix (XXII), whose blocks $B_{ij}$ are zero
in case $a_i \neq a_j$, and therefore with A has the property stated in the theorem
above, which we shall designate as the property P.

In case A and B are mutually two-commutative, McCoy (XV, Theorem 
5) shows that the third hypothesis of the theorem above may be omitted. 
The property P does not carry over to mutually k-commutative matrices, 
where k exceeds 2. For example, the matrices 




0 b O 
0 0 a 0-06 
B 

0 4e a 
0 O 0 O 3e a) 


are mutually three-commutative and the characteristic values of A+B are 
not those of B. Therefore ordinary commutative matrices and the quasi-com- 
mutative matrices of McCoy are the only types of mutually k-commutative 
matrices which necessarily have the property P. 


3. THE EQUATION $X_k = \mu^k X$

Evidently every matrix X which satisfies the equation
\[ AX - XA = \mu X \tag{3} \]
will likewise satisfy the equation
\[ X_k = \mu^k X, \tag{4} \]
where $X_k$ is the kth commute of A with respect to X and $\mu$ is a non-zero
scalar constant. On the other hand not all solutions of (4) satisfy (3). We shall
confine our attention to (4).


We may without restrictions upon the problem assume that A is in the
Jordan canonical form
\[ A = A_1 + A_2 + \cdots + A_s, \]
where $A_i = a_i I_i + D_i$ and $I_i$ and $D_i$ are respectively the unit and the auxiliary
unit matrices of order $n_i$. Under these assumptions the elementary divisors
of $A - \lambda I$ are $(a_i - \lambda)^{n_i}$ (i = 1, 2, ..., s). Let $X = (X_{ij})$, where $X_{ij}$
(i, j = 1, 2, ..., s) are $n_i \times n_j$ matrices; then (4) is equivalent to the $s^2$
equations
\[ X_{ij}^R \{ [A_i, A_j]^k - \mu^k I_i(I_j) \} = 0 \qquad (i, j = 1, 2, \ldots, s). \tag{14} \]
The necessary and sufficient condition that $X_{ij}$ be a non-zero matrix is that
the $n_i n_j \times n_i n_j$ matrix
\[ [A_i, A_j]^k - \mu^k I_i(I_j) \qquad (i, j = 1, 2, \ldots, s) \tag{15} \]
be singular. It has the characteristic value $(a_i - a_j)^k - \mu^k$ repeated $n_i n_j$ times
(XVIII-XXI). Hence the necessary and sufficient condition that $X_{ij}$ be a
non-zero matrix is that $(a_i - a_j)^k - \mu^k = 0$. Moreover, since $\mu \neq 0$, we have
$X_{ii} = 0$ (i = 1, 2, ..., s); that is, the trace of X, any solution of (3), is zero
(compare IV). These properties of X are invariants under the usual
transformations of matrices to normal form. Hence we have the theorems below.

THEOREM 12. The necessary and sufficient condition that equation (4) have
a solution other than the trivial solution X = 0 is that A have at least two
characteristic values a and b such that $(a - b)^k = \mu^k$.

This result was obtained by Weinstein (IX) for the case k = 1. 

THEOREM 13. The trace of every solution, X, of (4) is zero. 
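
A small numerical instance may make Theorems 12 and 13 concrete. The sketch below assumes that the kth commute is obtained by iterating $X \mapsto AX - XA$; the matrices, the value $\mu = 2$ and the function names are hypothetical illustrations, not taken from the paper.

```python
def matmul(X, Y):
    """Product of two square matrices stored as nested lists."""
    n = len(X)
    return [[sum(X[i][t] * Y[t][j] for t in range(n)) for j in range(n)]
            for i in range(n)]


def commute(A, X):
    """First commute of A with respect to X, assumed to be AX - XA."""
    AX, XA = matmul(A, X), matmul(X, A)
    n = len(A)
    return [[AX[i][j] - XA[i][j] for j in range(n)] for i in range(n)]


def kth_commute(A, X, k):
    """Apply the commute operation k times, starting from X."""
    for _ in range(k):
        X = commute(A, X)
    return X


def scale(c, X):
    """Scalar multiple c * X."""
    return [[c * entry for entry in row] for row in X]


mu = 2
A = [[2, 0], [0, 0]]   # characteristic values a = 2, b = 0: (a - b)^2 = mu^2
X = [[0, 1], [1, 0]]   # candidate solution with zero diagonal blocks

solves_4_for_k2 = kth_commute(A, X, 2) == scale(mu ** 2, X)  # X_2 = mu^2 X
solves_3 = commute(A, X) == scale(mu, X)                     # AX - XA = mu X
trace_X = X[0][0] + X[1][1]
```

Here X satisfies (4) for k = 2 but not (3), its trace is zero, and it is non-singular, which matches the remark that not all solutions of (4) satisfy (3).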

We shall now prove 


THEOREM 14. If $A - \lambda I$ has the elementary divisors $(a_i - \lambda)^{n_i}$
(i = 1, 2, ..., s), and if X is a solution of (4), where k is an odd integer, then X
is a nil-potent matrix if it is possible so to arrange the characteristic values
$a_i$ of A that
\[ \left( \frac{a_i - a_j}{\mu} \right)^k \neq 1 \qquad \text{for } i > j. \]

In this case all $X_{ij}$, $i \ge j$, are zero, and all non-zero $X_{ij}$, if such exist, lie
above the principal diagonal of X. That is, X is a nil-potent matrix.

We shall now expose the exact form of $X_{ij}$ in case $(a_i - a_j)^k/\mu^k = 1$. The
matrix (15) in this case becomes
\[
\{ (a_i - a_j) I_i(I_j) + N \}^k - \mu^k I_i(I_j)
 = \binom{k}{1} (a_i - a_j)^{k-1} N + \binom{k}{2} (a_i - a_j)^{k-2} N^2 + \cdots + N^k,
\]
where $N = [D_i, D_j]$. Let the right member be given by NQ; then Q is a
non-singular matrix since N is nil-potent. The equation (14) consequently becomes
\[ X_{ij}^R N = 0, \]
or
\[ D_i X_{ij} - X_{ij} D_j = 0. \tag{16} \]
This is the well known relation which arises in the study of matrices X
commutative with the Jordan canonical matrix A, save that in the present case
(16) holds if $(a_i - a_j)^k = \mu^k$, and not if $a_i = a_j$ as in case A and X are
commutative. Therefore $X = (X_{ij})$ (i, j = 1, 2, ..., s), a solution of (4), is such that in
case $(a_i - a_j)^k = \mu^k$, $X_{ij}$ has zero elements in the first $\{n_i, n_j\} - 1$ diagonals,
and the elements in each of the remaining diagonals of $X_{ij}$ are all equal but
arbitrary and independent of those of another diagonal. If $(a_i - a_j)^k \neq \mu^k$, then
$X_{ij} = 0$. From the structure of X here discussed, the following theorem is at
once evident, since if it is not satisfied then X will have at least one row or
column of zero elements and will be singular.

THEOREM 15. If $A - \lambda I$ has the elementary divisors $(a_i - \lambda)^{n_i}$
(i = 1, 2, ..., s), then the necessary and sufficient condition that the equation (4)
have a non-singular solution X is that for every i (i = 1, 2, ..., s) there exist at
least one j (j = 1, 2, ..., s), and for every j there exist at least one i, such that
\[ n_i = n_j \quad \text{and} \quad (a_i - a_j)^k = \mu^k \qquad (i, j = 1, 2, \ldots, s). \]


If the matrix A is in the Jordan canonical form, then X, a solution of (4), 
is an umbral matrix (XXII, Definition 3) and consequently 


[|X| =| Xa] 


where $X_\kappa = (X_{ij})$ ($\kappa = \alpha, \beta, \ldots$) and i, j run over only those values
for which $n_i = n_j = n_\kappa$, and $n_\alpha, n_\beta, \ldots$ are the distinct values of $n_i$
(i = 1, 2, ..., s). (See XXII, Theorem III.) This fact makes the restriction
$n_i = n_j$ and $(a_i - a_j)^k = \mu^k$ a necessary one, else $|X| = 0$.

4. SETS OF SEMI-COMMUTATIVE MATRICES 


If A and B satisfy the relations
\[ AB = \omega BA \quad \text{and} \quad A^k = B^k = I, \tag{17} \]
where $\omega$ is a primitive kth root of unity, they have been called semi-commutative
by Williamson (XVII). On the basis of the first equation (17) we can
readily show that
\[ B_i = (\omega - 1)^i B A^i, \]
where $B_i$ is the ith commute of A with respect to B. Consequently, because
of the second restriction upon A in (17), we have
\[ B_k = (\omega - 1)^k B. \tag{18} \]
This proves

THEOREM 16. If A is a member of the set of semi-commutative matrices,
then a second member of that set is a solution of the equation
\[ X_k = (\omega - 1)^k X, \]
where $X_k$ is the kth commute of A with respect to X.
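
The relations of this section can be checked numerically for k = 3. The sketch below uses a diagonal "clock" matrix for A and a cyclic shift for B; this particular pair, the tolerance and the helper names are assumptions made for illustration and do not come from the paper. The commutes follow the assumed convention $B_{i+1} = AB_i - B_iA$.

```python
import cmath


def matmul(X, Y):
    """Product of two square matrices stored as nested lists."""
    n = len(X)
    return [[sum(X[i][t] * Y[t][j] for t in range(n)) for j in range(n)]
            for i in range(n)]


def matpow(X, e):
    """X raised to the non-negative integer power e."""
    n = len(X)
    result = [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(e):
        result = matmul(result, X)
    return result


def commute(A, B):
    """First commute of A with respect to B, assumed to be AB - BA."""
    AB, BA = matmul(A, B), matmul(B, A)
    n = len(A)
    return [[AB[i][j] - BA[i][j] for j in range(n)] for i in range(n)]


def scale(c, X):
    """Scalar multiple c * X."""
    return [[c * entry for entry in row] for row in X]


def close(X, Y, tol=1e-9):
    """Entrywise comparison up to a floating-point tolerance."""
    n = len(X)
    return all(abs(X[i][j] - Y[i][j]) < tol for i in range(n) for j in range(n))


k = 3
omega = cmath.exp(2j * cmath.pi / k)  # a primitive kth root of unity
A = [[omega ** i if i == j else 0 for j in range(k)] for i in range(k)]
B = [[1 if i == (j + 1) % k else 0 for j in range(k)] for i in range(k)]

relation_17 = close(matmul(A, B), scale(omega, matmul(B, A)))

commutes = [B]  # B_0, B_1, ..., B_k
for _ in range(k):
    commutes.append(commute(A, commutes[-1]))

formula_holds = all(
    close(commutes[i], scale((omega - 1) ** i, matmul(B, matpow(A, i))))
    for i in range(k + 1)
)
equation_18 = close(commutes[k], scale((omega - 1) ** k, B))
```

Floating-point roots of unity are only approximate, which is why the comparisons use a tolerance instead of exact equality.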


The theory developed in §3 is applicable in this section; however, the re- 
sults there obtained are more general than necessary in the present case. A 
special study of (17) is superfluous in view of Williamson’s results (XVII).

ON THE ORDER OF GROUPS OF AUTOMORPHISMS*


BY
GARRETT BIRKHOFF AND PHILIP HALL


1. Introduction. Consider the following problem. Let G be any group of
finite order g, and let A denote the group of the automorphisms of G. What
can one infer about the order a of A, simply from a knowledge of g: in other
words, to what extent is a a numerical function of g?

The main known result relating to this problem is due to Frobenius.† It
limits the orders of the individual elements of A in terms of g, and hence tells
which primes can be divisors of a.

The present paper is independent of the work of Frobenius, and presup- 
poses only the theorems of Lagrange and Sylow. Its main result is the follow- 
ing 

THEOREM 1. Let G be any group of finite order g. Let $\theta(g)$ denote the order
of the group of the automorphisms of the elementary Abelian group of order g,
and let r denote the number of distinct prime factors of g. Then the order a of
the group A of the automorphisms of G is a divisor of $g^{r-1}\theta(g)$.


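The function $\theta(g)$ is computed numerically from g as follows. Write g
as the product $p_1^{n_1} p_2^{n_2} \cdots p_r^{n_r}$ of powers $p_k^{n_k}$ of distinct primes. Then
\[
\theta(g) = \prod_{k=1}^{r} \theta(p_k^{n_k})
 = \prod_{k=1}^{r} (p_k^{n_k} - 1)(p_k^{n_k} - p_k) \cdots (p_k^{n_k} - p_k^{n_k-1})
 = \prod_{k=1}^{r} p_k^{n_k(n_k-1)/2} \prod_{i=1}^{n_k} (p_k^{i} - 1).
\]
For example, $\theta(12) = \theta(3)\theta(4) = 2 \cdot (3 \cdot 2) = 12$.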
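
The computation of $\theta(g)$ is easy to mechanize. The following Python sketch factors g by trial division and multiplies the factors $p^n - p^j$ (j = 0, ..., n-1) for each prime power; the function names are illustrative and the trial-division factorization is merely a convenient assumption suitable for small g.

```python
def prime_power_factors(g):
    """Return {p: n} with g equal to the product of p**n (trial division)."""
    factors = {}
    p = 2
    while p * p <= g:
        while g % p == 0:
            factors[p] = factors.get(p, 0) + 1
            g //= p
        p += 1
    if g > 1:  # whatever remains is a single prime factor
        factors[g] = factors.get(g, 0) + 1
    return factors


def theta(g):
    """Product over the prime powers p**n of g of (p**n - p**j), j < n."""
    result = 1
    for p, n in prime_power_factors(g).items():
        for j in range(n):
            result *= p ** n - p ** j
    return result
```

For instance, `theta(12)` multiplies (4 - 1)(4 - 2) for the prime power 4 by (3 - 1) for the prime 3, reproducing the value 12 of the example.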

One can strengthen Theorem 1 in special cases, by 

THEOREM 2. If G is solvable, then a is a divisor of $g\theta(g)$.

THEOREM 3. If G is “hypercentral,” that is, the direct product of its Sylow
subgroups, then a is a divisor of $\theta(g)$.

2. Preliminary lemmas. The following two statements are immediate 
corollaries of Lagrange’s and Sylow’s Theorems, respectively: 


* Presented to the Society, December 26, 1933; received by the editors August 20, 1935.
† Über auflösbare Gruppen, II, Berliner Sitzungsberichte, 1895, p. 1030. Cf. Burnside’s Theory of
Groups, 1st edition, pp. 250-252.

Lemma 1. Let H be any group whose elements induce automorphisms
homomorphically (i.e., many-one isomorphically) on a second group G. Then the index
in H of the subgroup “centralizing” G (i.e., leaving every element of G invariant)
divides the order of the group of the automorphisms of G.


Lemma 2. Let G be any group, and r any positive integer. If the order of
every prime-power subgroup of G divides r, then the order of G divides r.

As a further preliminary step, it is well to verify the somewhat less obvi- 
ous 


Lemma 3. Let P be any group of prime-power order $p^n$, inducing substitutions
homomorphically on $r = p^k q$ letters $x_1, \ldots, x_r$ [$p^k$ the highest power of p
dividing r]. Then there is a letter $x_i$ such that, if S denotes the subgroup of
substitutions of P which omit $x_i$, the index of S in P divides r.


Let $S_i$ denote that subgroup of P whose substitutions omit the letter $x_i$;
by Lagrange’s Theorem, the index of $S_i$ in P is a power $p^{k(i)}$ of p. Hence the
transitive system including $x_i$ contains exactly $p^{k(i)}$ letters. But the sum of
the numbers of letters in the different transitive systems is not a multiple
of $p^{k+1}$; hence for some i, $k(i) \le k$. Setting $S = S_i$, we have Lemma 3.


Lemma 4.† Let G be any group of prime-power order $p^n$. Then the order a
of the group A of the automorphisms of G divides
$\theta(p^n) = (p^n - 1)(p^n - p) \cdots (p^n - p^{n-1})$.


By Lemma 2, it is sufficient to prove the result for every subgroup Q of A of
prime-power order $q^m$. But given Q, one can define $Q_1 > Q_2 > Q_3 > \cdots > Q_s = 1$
and $S_1 < S_2 < S_3 < \cdots < S_s = G$ recursively as follows:


(1) $Q_1$ is the group Q.

(2) Given $Q_i$, $S_i$ is the subgroup of the elements of G “centralized” by $Q_i$
(i.e., invariant under every automorphism of $Q_i$).

(3) Given $Q_i$ and $S_i$, $Q_{i+1}$ is a proper subgroup of $Q_i$ whose index in $Q_i$
divides the number of elements in $G - S_i$.


The only questionable point in the existence of these subgroups concerns the
possibility of (3); this is ensured by Lemma 3.

Moreover multiplying together on one side the indices of the $Q_{i+1}$ in the
$Q_i$, and on the other their multiples, the degrees of the $G - S_i$, one sees that
$q^m$ divides the product of those factors $(p^n - p^k)$ corresponding to the orders
of complexes $G - S_i$. Hence a fortiori $q^m$ divides $\theta(p^n)$, and the lemma is
proved.

† A more delicate result implying this, but presupposing a study of the structure of groups of
prime-power order, is given by P. Hall in A contribution to the theory of groups of prime-power order,
Proceedings of the London Mathematical Society, vol. 36 (1933), p. 37.
3. Proof of principal theorem. We are now in a position to prove Theo- 
rem 1. 

Accordingly, let G be any group of finite order g, let $g = p_1^{n_1} \cdots p_r^{n_r}$, let
$\theta(g)$ denote the order of the group of the automorphisms of the elementary
Abelian group of order g, and let A (of order a) denote the group of the
automorphisms of G.

By Sylow’s Theorem, G contains subgroups $S_i^j$ of orders $p_i^{n_i}$ [i = 1, ..., r;
j = 1, ..., $s_i$]. By Sylow’s Theorem also,† $s_i$ is the index in G of the “normalizer”
of any $S_i^j$ (i.e., the set of elements $a \in G$ such that $aS_i^j = S_i^j a$); hence, by
Lagrange’s Theorem and the fact that $S_i^j$ is contained in its own normalizer,
$s_i$ divides $g/p_i^{n_i}$.

Again, the automorphisms of G obviously permute the $S_i^j$ of given order
$p_i^{n_i}$ homomorphically. Therefore, by iterated use of Lemma 3, any subgroup
Q of A of prime-power order $q^m$ contains a subgroup $Q_1$ whose index in Q
divides the product $\prod_{i=1}^{r} (g/p_i^{n_i}) = g^{r-1}$, and which normalizes at least one
$S_i^{j(i)}$ of each order $p_i^{n_i}$. But by Lemma 1 and iterated use of Lemma 4, $Q_1$ has a
subgroup $Q^*$ whose index in $Q_1$ divides $\theta(g)$, and which “centralizes”
$S_1^{j(1)}, \ldots, S_r^{j(r)}$ [i.e., leaves every element of these subgroups of G invariant].
But the $S_i^{j(i)}$ generate G; hence $Q^*$ contains only the identity, and $q^m$ divides
$g^{r-1}\theta(g)$.

Theorem 1 now follows from Lemma 2 and the fact that Q was permitted 
to be an arbitrary group of prime-power order. 

4. Special cases of solvable and hypercentral groups. The proofs of Theo- 
rems 2—3 are now immediate. 

In fact, Theorem 3 is really a corollary of Lemma 4. For the Sylow
subgroups of a hypercentral group are characteristic. Denoting them by
$S_1, \ldots, S_r$, one sees immediately that the group of the automorphisms of
G is the direct product of the groups of the automorphisms of the $S_i$,
making the theorem obvious.

To prove Theorem 2, suppose that G is solvable, and use the stronger
known result,‡ analogous to Sylow’s Theorem, that G contains subgroups of
every index $p_i^{n_i}$. Now in the proof of Theorem 1 presented in §3, if q does
not divide g, it is numerically evident that $q^m$ divides $\theta(g)$. Hence, by Lemma
2, it is sufficient to show that if q divides g, then $q^m$ divides $g\theta(g)$.


† More particularly, the part that states that the inner automorphisms of G are transitive on
the Sylow subgroups of any fixed order.

‡ Cf. P. Hall, A note on soluble groups, Journal of the London Mathematical Society, vol. 3
(1928), p. 99.

But to say that q divides g is evidently to say that $q = p_k$ for suitable k;
without loss of generality, we can assume k = 1. In this case Q normalizes
some Sylow subgroup S of G of order $p_1^{n_1}$; this follows from Lemma 3 and the
fact that the number of Sylow subgroups of order $p_1^{n_1}$, being a divisor of
$p_2^{n_2} \cdots p_r^{n_r}$, is not divisible by q. Moreover Q has a subgroup $Q_1$ whose index
in Q divides $p_1^{n_1}$ [and hence g] which “normalizes” (i.e., leaves invariant) a
subgroup H of order $p_2^{n_2} \cdots p_r^{n_r}$ (and index $p_1^{n_1}$) in G; this follows from
Lemma 3 and the fact that by Hall’s Theorem cited above, the number of
such subgroups H is a divisor of $p_1^{n_1}$.

Finally, by Lemmas 1 and 4, the index in $Q_1$ of the subgroup $Q_2$ “centralizing”
S divides $\theta(p_1^{n_1})$. And by induction on g, the index in $Q_2$ of the subgroup
$Q^*$ “centralizing” H divides $(p_2^{n_2} \cdots p_r^{n_r}) \cdot \theta(p_2^{n_2} \cdots p_r^{n_r})$, or, since it is by
Lagrange’s Theorem a power of $q = p_1$, and relatively prime to $p_2^{n_2} \cdots p_r^{n_r}$,
it divides $\theta(p_2^{n_2} \cdots p_r^{n_r})$. But S and H, if only by Lagrange’s Theorem,
generate G; hence $Q^* = 1$. Combining, one sees that if q divides g, then $q^m$
divides $g\,\theta(p_1^{n_1})\,\theta(p_2^{n_2} \cdots p_r^{n_r})$, that is, $g\theta(g)$. But this is just what we wished
to prove.

5. Possible improvement of results. It is natural to ask what likelihood 
there is of improving the results expressed in Theorems 1-3. 

It is well known that the least upper bound to the possible values of a
for fixed g is at least $\theta(g)$; this is shown by the elementary Abelian group of
order g. Consequently Theorem 3 is a best possible result. Moreover in general
$\theta(g)$ is not a common multiple for the possible values of a, as is shown by
the dihedral group of order six and many other groups of similar structure.
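
The dihedral group of order six can be examined by brute force. The sketch below realizes it as the group of all substitutions on three letters (an identification assumed here for convenience) and counts the bijections of the group onto itself that preserve products; the function names are illustrative.

```python
from itertools import permutations

# The six substitutions on three letters, each stored as a tuple of images.
elements = list(permutations(range(3)))


def compose(p, q):
    """Composite substitution: apply q first, then p."""
    return tuple(p[q[x]] for x in range(3))


def count_automorphisms():
    """Count bijections phi of the group with phi(pq) = phi(p) phi(q)."""
    count = 0
    for images in permutations(elements):
        phi = dict(zip(elements, images))
        if all(phi[compose(p, q)] == compose(phi[p], phi[q])
               for p in elements for q in elements):
            count += 1
    return count


g = 6
a = count_automorphisms()
theta_g = (2 - 1) * (3 - 1)  # theta(6) = theta(2) * theta(3)
```

The count a equals 6, which does not divide $\theta(6) = 2$ but does divide $g\theta(g) = 12$, in agreement with Theorems 1 and 2.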

On the other hand, there is no known example of a group for which a
fails to divide $g\theta(g)$; this suggests the possibility of replacing $g^{r-1}\theta(g)$ in
Theorem 1 by $g\theta(g)$, and omitting Theorem 2 altogether.

This leaves the determination of lower bounds and common divisors of a
in terms of g unattempted. The cyclic groups of order g should throw
considerable light on this more trivial question.

Also, the case in which G is simple would probably repay study.

CORRECTION TO THE PAPER “A PROBLEM CONCERNING
ORTHOGONAL POLYNOMIALS”*


BY
GABRIEL SZEGO


Professor Walsh has called to my attention the fact that on page 197
of this paper “$|D(z)|^2 = m(z)$, z on $C_1$,” should be replaced by
“$|D(z)|^2 = \gamma m(z)$, z on $C_1$, $\gamma > 0$.”


* These Transactions, vol. 37 (1935), pp. 196-206.